On templates and programming languages

As many folks have noted, our current templating system works ok for simple things, but doesn’t scale well — even moderately complex conditionals or text-munging will quickly turn your template source into what appears to be line noise…

<includeonly><span style="white-space: nowrap;">{{#if:{{{3|}}}|
{{coord|{{{1|0}}}|{{{2|0}}}|{{{3|0}}}|{{{4|N}}}|{{{5|0}}}|{{{6|0}}}|{{{7|0}}}|{{{8|E}}}|{{{9|type:other}}}|format={{{format|dms}}}|display={{#if:{{{title|}}}|inline,title|inline}} }}| {{#if:{{{2|}}}|
{{coord|{{{1|0}}}|{{{2|0}}}|{{{4|N}}}|{{{5|0}}}|{{{6|0}}}|{{{8|E}}}|{{{9|type:other}}}|format={{{format|dms}}}|display={{#if:{{{title|}}}|inline,title|inline}}}}| {{#if:{{{4|}}}|
{{coord|{{{1|0}}}|{{{4|N}}}|{{{5|0}}}|{{{8|E}}}|{{{9|type:other}}}|format={{{format|dec}}}|display={{#if:{{{title|}}}|inline,title|inline}}}}| {{#if:{{{1|}}}|
{{coord|{{{1|0}}}|{{{5|0}}}|{{{9|type:other}}}|format={{{format|dec}}}|display={{#if:{{{title|}}}|inline,title|inline}}}}}}}}}}}}</span></includeonly><noinclude>
{{pp-template|small=yes}}
{{documentation}}
</noinclude>

And we all thought Perl was bad!  😉
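For comparison, here is a rough sketch of the same branching in Python, one of the candidate languages discussed below. It is only an illustration, not a proposal for real syntax: the names `render_coord`, `BRANCHES`, `DEFAULTS` and `is_set` are made up for this example, the parameters are passed as a plain dictionary, and the surrounding `<span>` and `<noinclude>` parts are left out. It assumes that `#if` treats a whitespace-only value as empty and that a default such as `{{{1|0}}}` applies only when the parameter is absent.

```python
# Defaults for the positional parameters, as written in the template above.
DEFAULTS = {"1": "0", "2": "0", "3": "0", "4": "N",
            "5": "0", "6": "0", "7": "0", "8": "E", "9": "type:other"}

# Each entry: (parameter tested by #if, parameters forwarded to coord, default format).
BRANCHES = [
    ("3", ["1", "2", "3", "4", "5", "6", "7", "8", "9"], "dms"),
    ("2", ["1", "2", "4", "5", "6", "8", "9"], "dms"),
    ("4", ["1", "4", "5", "8", "9"], "dec"),
    ("1", ["1", "5", "9"], "dec"),
]


def is_set(params, key):
    """Mimic {{#if:{{{key|}}}|...}}: true when the parameter is non-empty."""
    return params.get(key, "").strip() != ""


def render_coord(params):
    """Return the {{coord}} call that the first matching branch would produce."""
    display = "inline,title" if is_set(params, "title") else "inline"
    for test_key, forwarded, default_format in BRANCHES:
        if is_set(params, test_key):
            args = [params.get(key, DEFAULTS[key]) for key in forwarded]
            args.append("format=" + params.get("format", default_format))
            args.append("display=" + display)
            return "{{coord|" + "|".join(args) + "}}"
    return ""  # no branch matched, so the template outputs nothing


print(render_coord({"1": "51.5", "5": "-0.1"}))
```

The four nested `#if` calls become one loop over a table, which is the kind of readability a real scripting language would buy us.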

Lua

There’s been talk of Lua as an embedded templating language for a while, and there’s even an extension implementation.
One advantage of Lua over other languages is that its implementation is optimized for use as an embedded language, and it looks kind of pretty.
An inherent disadvantage is that it’s a fairly rarely used language, so it still requires special learning on the part of potential template programmers.
An implementation disadvantage is that it currently is dependent on an external Lua binary installation — something that probably won’t be present on third-party installs, meaning Lua templates couldn’t be easily copied to non-Wikimedia wikis.
There are perhaps three primary alternative contenders that don’t involve making up our own scripting language (something I’d dearly like to avoid):

PHP

  • Advantage: Lots of webbish people have some experience with PHP or can easily find references.
  • Advantage: we’re pretty much guaranteed to have a PHP interpreter available.  🙂
  • Disadvantage: PHP is difficult to lock down for secure execution.

JavaScript

  • Advantage: Even more folks have been exposed to JavaScript programming, including Wikipedia power-users.
  • Disadvantage: Server-side interpreter not guaranteed to be present. Like Lua, would either restrict our portability or would require an interpreter reimplementation. 😛

Python

  • Advantage: A Python interpreter will be present on most web servers, though not necessarily all. (Windows-based servers especially.)
  • Wash: Python is probably better known than Lua, but not as well as PHP or JS.
  • Disadvantage: Like PHP, Python is difficult to lock down securely.

Any thoughts? Does anybody happen to have a PHP implementation of a Lua or JavaScript interpreter?  😉
— brion
Update:
Hampton reminds me that Ruby has some sandboxing features and may also be a contender.

Archive notice: This is an archived post from blog.wikimedia.org, which operated under different editorial and content guidelines than Diff.

11 Comments

One of the suggestions has been to use, or at least build on the Abuse Filter language.
Advantage: PHP implementation readily available.
Advantage: We have in-house expertise to modify it to our needs.
Advantage: It’s already used elsewhere in MediaWiki.
Disadvantage: Needs to be rewritten to improve performance. Victor is working on this.
Disadvantage: Not everybody knows it, certainly not to the same extent as PHP, JavaScript or Python.

If the security issues can be handled, I’d go for PHP for all the reasons you’ve stated already. I like the idea of having the templates written in the same language as the software itself. I also think Python has a lot going for it: it is very easy to learn, for one thing, and the indentation makes for readable code, which is important in templates where a lot of people are supposed to edit together.

For security reasons, I would suggest using a restricted implementation of the chosen language that only provides what is really needed. So using a language not usually present on a server should also be no problem.
For instance, if the language/implementation does not offer a function to delete a file, no one can abuse it to delete a file.
The other thing is whether you really need a Turing-complete language for templates. Maybe something (much?) smaller would be at least as good from a usability point of view and probably much less of a security risk.
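A minimal sketch of that "only provide what is really needed" idea, written in Python purely for illustration: a toy expression evaluator that accepts only whitelisted syntax and functions. The names `evaluate`, `BINARY_OPS` and `FUNCTIONS` are hypothetical, and this is not how any existing extension works. It also does nothing about CPU or memory limits, which would be a separate problem.

```python
import ast
import operator

# The whole "language": two operators and three functions.
# Nothing here can touch files, processes or the network.
BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub}
FUNCTIONS = {"upper": str.upper, "lower": str.lower, "len": len}


def evaluate(source, variables):
    """Evaluate a tiny expression language, rejecting anything not whitelisted."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, str)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in FUNCTIONS and not node.keywords):
            return FUNCTIONS[node.func.id](*[walk(arg) for arg in node.args])
        # Attribute access, imports, unknown names and calls all end up here.
        raise ValueError("not allowed: " + type(node).__name__)

    return walk(ast.parse(source, mode="eval"))


print(evaluate('upper(name) + "!"', {"name": "lua"}))
```

An expression such as `open("/etc/passwd")` is rejected with a `ValueError`, simply because `open` was never put into the whitelist.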

Ruby would be just lovely ^_^ It’s already used on the mobile site – why not expand that to the wikis, too!

I think simplicity should be valued: choosing a language like PHP, JavaScript, or Python involves setting a higher learning curve for the syntax. One of the advantages of the current syntax is that it has a rather short learning curve: there are only a few basic methods and structures, and complex operations are merely a matter of forming compounds. (Whether those compounds quickly become unreadable is another issue.) Lua looks promising because it’s so readable and simple. If a suitable implementation can be found, it would seem to be the best option among those you’ve listed, for that reason. The…

Although I am a huge fan of Lua, I would have to say go with PHP for usability and more widespread familiarity and harmony with MediaWiki. Run it under ptrace with most of the syscalls disallowed, strict resource limits, and in a chroot jail.

Hmm, externally restricting a PHP process probably isn’t too easy to do in a portable way; what works on Linux won’t work on BSD/OS X and never mind Windows… 🙁

I saw this today and am no longer a huge fan of Lua.

“One thing that’s looks especially promising in the Lua extension is the ability for different templates on the same page to “talk” to each other through shared variables.” Ugh. So when will templates start throwing exceptions and passing arguments by reference? Unreadable as the current syntax might be, it is conceptually very simple: you have functions that receive parameters and return something based on those, with no variables and no side effects (that is, a very simple functional language). Providing a more effective language with a very different concept is a fair tradeoff as long as it can be isolated…
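To make the concern concrete, here is a small Python sketch contrasting a side-effect-free template with templates that talk through shared variables. All names (`pure_template`, `counter_template`, `summary_template`, `render_page`) are invented for this illustration and do not correspond to any real extension API.

```python
def pure_template(args):
    """Output depends only on the arguments, like today's templates."""
    return "Hello, " + args["name"]


def counter_template(shared):
    """Writes to page-wide shared state as a side effect."""
    shared["count"] = shared.get("count", 0) + 1
    return "Note " + str(shared["count"])


def summary_template(shared):
    """Reads whatever earlier templates on the page left behind."""
    return str(shared.get("count", 0)) + " notes"


def render_page(templates):
    """Render templates in page order, all seeing the same shared dictionary."""
    shared = {}
    return [template(shared) for template in templates]


print(pure_template({"name": "brion"}))
print(render_page([counter_template, summary_template]))
print(render_page([summary_template, counter_template]))
```

The pure template can be understood and cached on its own; the other two give different output depending on where they sit on the page.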

now thinking Andrew’s suggestion is the most parsimonious
Where is the abuse filter language documented?

http://www.mediawiki.org/wiki/Extension:AbuseFilter/RulesFormat
Now all we need is I/O routine(s) suitable for template argument processing and wikitext generation.