Converting WordPress to files
by Conner McDaniel
Recently my brother introduced me to a nifty idea called Markdown, a text format that’s designed to be easily readable while still having enough syntax to convert to other file types (namely HTML). As is common with good ideas, others are working on making it even better. One such case is MultiMarkdown, which attempts to allow the conversion of this text format to LaTeX, RTF, FODT and PDF. Furthermore, once a few people sank their teeth into the idea, an HTML-to-Markdown converter was made so that others could convert documents previously written in HTML to this new format for storage and readability.
Right now, Markdown is still in its beginning stages and could really use some better streamlining from products like MultiMarkdown to make it more popular, but I like the readability of the format more than the potential for conversion. For that reason, I decided that I’d pull all my HTML WordPress posts and convert them to this format for storage in case I ever changed the structure of my website. This proved to be tedious. Even getting the raw HTML out of WordPress is a bit tedious. WordPress has an export option, but it only gives you an XML file that you must then parse to get the real HTML out. So I had a sharp idea and ran with it. Why not do the parsing and the conversion at the same time? And furthermore, why not make it available to other people? So I wrote a quick script in PHP that will parse the WordPress export XML and convert it to a desired format. There are a few options for how you want the folders and files to be named as well.
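To give a rough idea of what the parsing step involves, here is a minimal sketch in Python (not the PHP script itself). It assumes the usual WordPress export layout, where each post is an `item` element with its HTML in `content:encoded`. The `wp` namespace version (1.2 here) differs between exports, and the `extract_posts` name and `SAMPLE` data are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URIs found in WordPress export files. The version suffix on
# the "wp" namespace varies between exports, so 1.2 is an assumption.
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wp": "http://wordpress.org/export/1.2/",
}

# Tiny hand-written stand-in for a real export file (illustrative only).
SAMPLE = """<rss version="2.0"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item>
      <title>Hello world</title>
      <wp:post_date>1999-12-31 23:59:00</wp:post_date>
      <wp:post_name>hello-world</wp:post_name>
      <wp:post_type>post</wp:post_type>
      <content:encoded><![CDATA[<p>First <em>post</em>.</p>]]></content:encoded>
    </item>
    <item>
      <title>About</title>
      <wp:post_date>2000-01-01 10:00:00</wp:post_date>
      <wp:post_name>about</wp:post_name>
      <wp:post_type>page</wp:post_type>
      <content:encoded><![CDATA[<p>A page, not a post.</p>]]></content:encoded>
    </item>
  </channel>
</rss>"""


def extract_posts(xml_text):
    """Return the title, date, slug and raw HTML of every post in the export."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        # Skip pages, attachments and anything else that is not a post.
        if item.findtext("wp:post_type", default="", namespaces=NS) != "post":
            continue
        posts.append({
            "title": item.findtext("title", default=""),
            "date": item.findtext("wp:post_date", default="", namespaces=NS),
            "slug": item.findtext("wp:post_name", default="", namespaces=NS),
            "html": item.findtext("content:encoded", default="", namespaces=NS),
        })
    return posts


if __name__ == "__main__":
    for post in extract_posts(SAMPLE):
        print(post["date"], post["slug"], post["html"])
```

Each `html` value can then be handed to whatever HTML-to-Markdown converter you prefer; that step is left out here on purpose.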
How to convert WordPress posts into files:
- Go to WordPress->Tools->Export
- Choose ‘Posts’ and then click ‘Download Export File’
- Fill out the form below with your desired settings
- Upload the XML file and click ‘Go’
This should download a ZIP file with your desired folder structure, file naming format, and conversion type. None of your files will be shared or stored on the server (they are deleted by the script after the download). If you’re having problems with it or are not sure how to use the options then ask in the comments below!
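For the curious, the packaging step could look something like the sketch below. It is written in Python rather than PHP and assumes a year/month folder layout, which is only one of the possible naming options. The `build_zip` function and the `date`, `slug` and `body` keys are hypothetical names chosen for this example.

```python
import io
import zipfile


def build_zip(posts, extension="md"):
    """Pack converted posts into an in-memory ZIP laid out as YYYY/MM/slug.ext."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for post in posts:
            # Dates are assumed to look like "1999-12-31 23:59:00".
            year, month = post["date"][:4], post["date"][5:7]
            path = f"{year}/{month}/{post['slug']}.{extension}"
            archive.writestr(path, post["body"])
    return buffer.getvalue()


if __name__ == "__main__":
    example = [{"date": "1999-12-31 23:59:00", "slug": "hello-world", "body": "# Hello"}]
    data = build_zip(example)
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())
```

Building the archive in memory means nothing has to be written to disk on the server, which fits the idea of not keeping anyone's files around.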
[...] comes pre-installed on Mac. You can check your version with python --version. Download the python-markdown package, extract it and install it [...]
I am very interested in using your script as I want to convert my website from WordPress to a Jekyll based site using Markdown.
When I import my WordPress XML file I immediately get the following error message. The folder it is trying to create, 1999/12, is the oldest on my site and therefore the first.
Let me know if I can provide any other information regarding this issue.
Thanks!
Mark
Hi Mark,
Are you sure you exported the “Posts” XML file? For now, my script only works if you export “Posts” only. It doesn’t work for a total WordPress export. If you are already doing this, then could you provide the error message it’s giving you?
- Conner
I tried running your script on a WordPress export (posts only), but I got a “failed to open stream” error. Do you expect this to be working?
Cheers!
Hi, Ben. Hm… not sure why that’s not working. It is a bit old, but I checked and it still works for me. Does it give you a line number or file path?
I get a “file could not be moved” error. This would be a great script, as running it on the web is much easier than finagling with exitwp or other scripts.
I’m running the latest version of WordPress if it matters. Followed your instructions exactly.