---
title: "Latex to Markdown"
date: 2022-04-28T13:42:40+02:00
draft: false
tags:
  - markdown
  - latex
  - code
  - python
  - hugo
---

Recently I started porting some of my LaTeX articles to Markdown, as they would
make a fine contribution to this website in a simpler format. Writing a simple
parser in Python isn't that bad. I could have used [Pandoc](https://pandoc.org/index.html),
but I wanted a particular format for rendering a Hugo Markdown page, so I
prepared several regex-based functions in Python to dereference and construct
a Hugo-compatible Markdown file.

``` python3
from os import path
from pathlib import Path


class LatexFile:
    def __init__(self, src_file: Path):
        sys_path = path.abspath(src_file)
        src_dir = path.dirname(sys_path)
        src_file = path.basename(sys_path)
        # Start from a single \input of the root file and flatten from there.
        self.tex_src = self.flatten_input("\\input{" + src_file + "}", src_dir)
        # The .bbl file shares the source name; swap only the extension.
        self.filter_tex(path.splitext(sys_path)[0] + ".bbl")

    def filter_tex(self, bbl_file: Path) -> None:
        """Default TeX filtering procedure."""
        self.strip_tex()
        self.preprocess()
        self.replace_references(bbl_file)
        self.replace_figures()
        self.replace_tables()
        self.replace_equations()
        self.replace_sections()
        self.postprocess()
```

The general process for converting a LaTeX document is outlined above. The
principle is to first create a flat text source, which is then formatted
incrementally so that each LaTeX component is translated correctly.

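
To illustrate the flattening step, here is a minimal sketch of what a function
like `flatten_input` could look like. It is not the actual implementation: the
name `INPUT_PATTERN`, the standalone function signature, and the handling of a
missing `.tex` extension are assumptions made for this example, and it does not
skip `\input` commands that sit inside LaTeX comments.

``` python3
import re
from pathlib import Path

# Assumed pattern: matches \input{name} and captures the file name.
INPUT_PATTERN = re.compile(r"\\input\{([^}]+)\}")


def flatten_input(tex_src: str, src_dir: Path) -> str:
    """Recursively replace each \\input{...} with the text of the named file."""

    def expand(match: re.Match) -> str:
        target = Path(src_dir) / match.group(1)
        # LaTeX allows the extension to be omitted, so add it when missing.
        if target.suffix != ".tex":
            target = target.with_name(target.name + ".tex")
        # Recurse so that nested \input commands are flattened as well.
        return flatten_input(target.read_text(encoding="utf-8"), src_dir)

    # With a function as the replacement, the returned text is inserted as-is.
    return INPUT_PATTERN.sub(expand, tex_src)
```

Calling `flatten_input("\\input{main.tex}", Path("article"))` would then return
the whole document as one string, which is the form the later filter steps work
on.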
## LaTeX Components

To structure the Python code, I created several named tuples for
self-contained LaTeX contexts such as figures, tables, and equations. Each one
has a `markdown` property, so these sections can be replaced with
Hugo-friendly syntax, using shortcodes where appropriate.

``` python3
from typing import List, NamedTuple, Tuple


class Figure(NamedTuple):
    """Structured Figure Item."""

    span: Tuple[int, int]
    index: int
    files: List[str]
    caption: str
    label: str

    @property
    def markdown(self) -> str:
        """Markdown string for this figure."""
        fig_str = ""
        # Every file except the last is emitted without a title.
        for file in self.files[:-1]:
            fig_str += "{{" + f'< figure src="{file}" width="500" >' + "}}\n"
        # The last file carries the numbered caption.
        last_file = self.files[-1] if self.files else ""
        fig_str += (
            "{{"
            + f'< figure src="{last_file}" '
            + f'title="Figure {self.index}: {self.caption}" width="500" >'
            + "}}\n"
        )
        return fig_str
```
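
The `span` field is what makes the replacement itself straightforward. The
sketch below shows one way this could work, using an equation as the example.
`Equation`, `EQUATION_PATTERN`, `find_equations` and `replace_items` are
hypothetical names for illustration, not the code used for this site, and the
`$$` output assumes the page renders display math with something like MathJax
or KaTeX.

``` python3
import re
from typing import List, NamedTuple, Tuple


class Equation(NamedTuple):
    """Hypothetical item type with the same shape as Figure."""

    span: Tuple[int, int]
    body: str

    @property
    def markdown(self) -> str:
        """Markdown string for this equation as display math."""
        return f"$$\n{self.body.strip()}\n$$\n"


# Assumed pattern: one equation environment plus its trailing newline.
EQUATION_PATTERN = re.compile(
    r"\\begin\{equation\}(.*?)\\end\{equation\}\n?", re.DOTALL
)


def find_equations(tex_src: str) -> List[Equation]:
    """Collect every equation environment with its character span."""
    return [
        Equation(match.span(), match.group(1))
        for match in EQUATION_PATTERN.finditer(tex_src)
    ]


def replace_items(tex_src: str, items) -> str:
    """Swap each item's span for its markdown, working from the end."""
    # Replacing the last span first keeps the earlier offsets valid.
    for item in sorted(items, key=lambda it: it.span[0], reverse=True):
        start, end = item.span
        tex_src = tex_src[:start] + item.markdown + tex_src[end:]
    return tex_src
```

Because the spans are collected before any text changes, the replacements have
to run back to front. Any named tuple with a `span` and a `markdown` property
can be passed through the same `replace_items` helper.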