Convert Docx to Markdown

I needed to convert a Docx file to Markdown, but Pandoc kept giving me this obnoxious error:

$ pandoc test.docx -o
pandoc: Cannot decode byte '\xae': Data.Text.Encoding.Fusion.streamUtf8: Invalid UTF-8 stream

However, you can work around this with the tool unoconv: convert the Docx to HTML first, then pipe that HTML through Pandoc to get Markdown.

$ unoconv --stdout -f html test.docx | pandoc -f html -t markdown -o
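If you do this often, the pipeline can be wrapped in a small shell function. This is only a sketch: the names md_name and docx2md are made up for illustration, and the default output name (the input name with .docx swapped for .md) is my assumption, since the command above leaves the output file unspecified. It also assumes unoconv and pandoc are both on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: derive the Markdown filename from the .docx name,
# e.g. report.docx -> report.md (this naming scheme is an assumption).
md_name() {
    printf '%s\n' "${1%.docx}.md"
}

# Hypothetical wrapper around the two-step conversion:
# unoconv writes HTML to stdout, pandoc reads it and writes Markdown.
# The second argument is optional and defaults to md_name's result.
docx2md() {
    in=$1
    out=${2:-$(md_name "$in")}
    unoconv --stdout -f html "$in" | pandoc -f html -t markdown -o "$out"
}
```

With the functions loaded, docx2md test.docx would write test.md next to the input, and docx2md test.docx notes.md would write to notes.md instead.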

On Ubuntu (and, I would imagine, other Debian-based systems) you can get unoconv with a simple apt-get install unoconv.
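Before running the conversion it can help to confirm that both tools are actually installed. A minimal sketch, where missing_tools is a made-up helper name rather than anything shipped with unoconv or pandoc:

```shell
#!/bin/sh
# Hypothetical helper: print the name of each given tool that is not on PATH.
# Prints nothing when every tool is found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools unoconv pandoc lists whichever of the two still needs installing; on Ubuntu that would likely mean something like sudo apt-get install unoconv pandoc.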

Oh yea, and join the BDS movement to help Free Palestine from Apartheid Israel. Enjoy!


About Nahraf
Providing interesting insight into the world of Economics, Theology, Computer Science and Social phenomena.
