Just wanted to start brainstorming for my CMS as I head into the bath (first bath -- or shower -- in a few days, finally). If you've been reading my site for a while, you know I've been working on my CMS for a long time.
To recap: the first version, which runs this site, was written in PHP. I've decided PHP sucks big time (and I've blogged a lot to that effect), so I'm rewriting it in Python, using mod_python and Cheetah. The rewrite has been largely stalled by lots of other things in my life (school, other projects, etc.), but the good thing is that I've gotten to do lots of thinking about it. The new design will be similar to what runs my web site now, but different in important ways. Ultimately, the whole thing's going to take a very long time: it's kind of a lot of code, it's fairly complex (which means that it's not just coding, but doing a lot of design at the same time, which takes longer), and it involves rewriting a lot of stuff I already have in PHP, particularly my StructuredText library, which is a big project in itself.
Anyway, to my brainstorming:
Stuff that has to be modelled:
- Configuration - maps URLs to applications (or templates, or components).
- Data/Settings
- Templates
  - Nested arbitrarily
  - Require mapping of "sections" to other templates -- part of configuration
  - That makes for three separate config files so far: paths (mapping URLs to applications), settings, and "mappings" (mapping sections in templates to other templates)
  - The template should be separate from every other part of the CMS, and it should look to a template writer as if all the data in the CMS is available to the template at once. Movable Type is the prime example of this. In other words, data shouldn't have to be pre-populated, since that couples the template more tightly to its external configuration and makes for more maintenance. Furthermore, it's possible that some parts of the template won't get evaluated because of conditional blocks, so not all the data will always be necessary. The template has to request data from the CMS by name (in the form "module.data") and the CMS figures out how to supply it to the template. But the nice thing is that it all happens transparently from the template's perspective because Cheetah and Python are awesome.
- Code
  - Split into:
    - web applications (primary entry point for a URL)
    - models (model the domain, without regard to the web environment. i.e. generate no output directly, don't deal with sessions, etc.)
    - modules (code available for the templates to access or for the application to use. Uses models directly)
    - libraries
    - and maybe even "chunks" of code that can be used independently and evalled or whatever. This can be just small snippets of code that get evalled, or can be entire files that act a lot like CGI apps and do most of everything themselves (though of course they could use existing models, modules, etc. if they want to). This would be appropriate in cases where the separation into applications, modules, and templates is overkill for something simple. Something like generating an RSS or Atom feed might fit this description. An entire file "chunk" will probably be called a "component". So you might have your RSS feed component, or your weblog API component, etc.
- Content
  - Coming from various sources including:
    - files in the filesystem
    - RDBMS
    - data/configuration
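The "request data by name" idea from the Templates item can be sketched in plain Python. This is only a rough illustration, not how Cheetah does its lookups: `LazyNamespace`, `render`, and the `settings.site_title` / `weblog.recent_entries` names are all made up, and `render` is just a stand-in for a template with a conditional block.

```python
class LazyNamespace:
    """Resolves "module.data" names on demand and remembers the result."""

    def __init__(self, providers):
        # providers: {"module": {"data": zero-argument callable}}
        self._providers = providers
        self._cache = {}
        self.calls = []  # records which names were actually computed

    def get(self, dotted_name):
        if dotted_name not in self._cache:
            module, _, key = dotted_name.partition(".")
            try:
                provider = self._providers[module][key]
            except KeyError:
                raise LookupError("unknown data name: %s" % dotted_name)
            self.calls.append(dotted_name)
            self._cache[dotted_name] = provider()
        return self._cache[dotted_name]


def render(data, show_sidebar):
    # Stand-in for a template: the sidebar data is only requested
    # when the conditional block is actually evaluated.
    lines = ["Title: " + data.get("settings.site_title")]
    if show_sidebar:
        lines.append("Recent: " + ", ".join(data.get("weblog.recent_entries")))
    return "\n".join(lines)


# Hypothetical providers; in a real CMS these would be modules and settings.
providers = {
    "settings": {"site_title": lambda: "My Site"},
    "weblog": {"recent_entries": lambda: ["First post", "Second post"]},
}
```

Rendering with `show_sidebar=False` only ever computes `settings.site_title`; the weblog entries are never fetched, which is the point of not pre-populating the template.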
Other aspects:
Everything above is static (even if it generates content dynamically). So, to enable forms, data input, etc. you need "Actions".
For optimization, you need caching. Nice thing about mod_python is that I can cache stuff right in memory if I want to. Though, I still don't know how to best reload modules, or cleanly start up a fresh interpreter instance if I want to.
For user-specific stuff you need cookies and sessions. The CMS should probably also have facilities for multiple users, ACLs (access control lists), etc.
Finally, there are "controllers" -- separate from the main, or "front", controller which serves every request -- that execute very early in the request so they can muck with the request at a low level before other parts of the CMS get to deal with it. For instance, implementing a "URL translation layer" would fit here so if you wanted to keep old URLs working you could do it without having to hack different URL formats into your "application". This controller could do an external redirect or an internal redirect. This example would really be similar to some of what you can do with Apache with tools like mod_rewrite, but implemented at the CMS level.
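To make the "URL translation layer" concrete, here is a minimal sketch of early controllers as plain functions that look at the path and either return a redirect or stay out of the way. Everything in it is hypothetical: the `Redirect` tuple, the `OLD_URLS` table, the two example controllers, and the cap on internal redirects.

```python
from collections import namedtuple

Redirect = namedtuple("Redirect", ["location", "external"])

# Hypothetical table of old URL prefixes and their replacements.
OLD_URLS = {"/blog/": "/weblog/"}


def translate_old_urls(path):
    """Early controller: send old URL prefixes to their new home."""
    for old, new in OLD_URLS.items():
        if path.startswith(old):
            return Redirect(new + path[len(old):], external=True)
    return None  # no opinion; let the next controller look


def strip_index(path):
    """Early controller: quietly treat /foo/index.html as /foo/."""
    if path.endswith("/index.html"):
        return Redirect(path[:-len("index.html")], external=False)
    return None


def run_early_controllers(path, controllers, max_internal=5):
    """Run controllers in order; follow internal redirects, stop on external ones."""
    for _ in range(max_internal):
        for controller in controllers:
            result = controller(path)
            if result is not None:
                break
        else:
            return ("dispatch", path)  # nobody objected: hand off to the CMS
        if result.external:
            return ("redirect", result.location)
        path = result.location  # internal: rewrite the path and run again
    raise RuntimeError("too many internal redirects")
```

With both controllers installed, `/blog/2004/01` comes back as an external redirect to `/weblog/2004/01`, while `/weblog/index.html` is rewritten internally and dispatched as `/weblog/`.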
An important aspect of the system that I've recently become aware of is that an application should be able to be implemented in "layers". In other words, you should be able to add parts to your application depending on how complex your needs are, without everything needing to implement all possible parts. For instance, at base, you can simply have a template (or a component) mapped to a URL. If the template needs no data, you're done. At the next level, all data for the template could come from settings. Next, you could add a module. If your needs are complex enough, you could separate your code into any number of models available to your module. Somewhere along that line, if you need to do things like mess with cookies or sessions, or serve multiple URLs from that base URL (like with my weblog -- "/weblog/archive/2004/01/04"), you could make that URL served by an application. "Actions" are really separate, so they can be hooked in wherever.
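One way the "paths" configuration could support these layers is a table from URL prefix to a kind of handler, where only an application gets to serve URLs below its base. This is a sketch under my own assumptions: the `PATHS` format, the `resolve` function, and the rule that templates and components serve exactly one URL are all invented for illustration.

```python
# Hypothetical "paths" configuration: URL prefix -> (kind, target name).
PATHS = {
    "/about": ("template", "about.tmpl"),
    "/feed": ("component", "rss_feed"),
    "/weblog": ("application", "weblog"),
}


def resolve(url, paths=PATHS):
    """Find the longest configured prefix that owns this URL.

    Returns (kind, target, rest), where rest is the part of the URL below
    the prefix, or None when nothing is mapped there.
    """
    url = url.rstrip("/") or "/"
    best = None
    for prefix in paths:
        if url == prefix or url.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        return None
    kind, target = paths[best]
    rest = url[len(best):]
    if rest and kind != "application":
        return None  # assumption: templates and components serve one URL only
    return (kind, target, rest)
```

So `/about` resolves straight to a template with nothing else involved, while `/weblog/archive/2004/01/04` goes to the weblog application with `/archive/2004/01/04` left over for it to interpret.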
There are also some other concerns which have to fit somewhere, maybe things such as post filters, and particularly error pages. There should be a system-wide 404 page that will be at a URL like /errors/404. However, these types of things might simply belong outside of the CMS, though the CMS should still output something itself in case of an error, when nothing is anchored at a URL, etc. So there should at least be some classes of errors which the CMS and every application should be expected to understand, such as a "not found" error. For instance, every application, if given a URL it doesn't understand, should be able to tell the CMS that it doesn't accept the URL, in which case a 404 would be generated. This happens now in my current CMS (try going to /weblog/foo).
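The "application tells the CMS it doesn't accept the URL" contract maps naturally onto an exception. A minimal sketch, with made-up names throughout (`NotFound`, `weblog_app`, `render_404`, `front_controller`) and a toy rule for which URLs the weblog understands:

```python
class NotFound(Exception):
    """Raised by an application that does not accept the URL it was given."""


def weblog_app(rest):
    # Hypothetical application: only understands its root and /archive/...
    if rest == "" or rest.startswith("/archive/"):
        return "weblog page for %r" % rest
    raise NotFound(rest)


def render_404(url):
    # Stand-in for rendering the system-wide page at /errors/404.
    return "404: nothing at %s" % url


def front_controller(url, applications):
    """Dispatch to the application owning the URL; turn NotFound into a 404."""
    for base, app in applications.items():
        if url == base or url.startswith(base + "/"):
            try:
                return 200, app(url[len(base):])
            except NotFound:
                break  # the application declined the URL
    return 404, render_404(url)
```

With `{"/weblog": weblog_app}` as the application table, `/weblog/foo` yields a 404 status because the application itself declines the URL, and an unmapped URL like `/other` ends up at the same 404 page.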
Finally, the concept of "namespaces" is very important. Certain things all share the same namespace. For instance, each module is its own namespace, but settings are also part of that namespace. So, you can't have a module function and a setting value with the same name if they're in the same namespace (well, you can, but one will be "invisible" unless you get it explicitly). Template mappings, however, are in a completely separate namespace.
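The shadowing behaviour can be sketched with the standard library's `ChainMap`, which looks names up in several dicts in order. The assumption that module functions win over settings on a clash is mine (I haven't decided which side should take priority), and the `weblog_*` names are placeholders.

```python
from collections import ChainMap


def build_namespace(module_functions, settings):
    """Merge a module's functions and its settings into one lookup namespace.

    Assumption: module functions win on a name clash.
    """
    return ChainMap(module_functions, settings)


weblog_functions = {
    "title": lambda: "computed title",
    "entries": lambda: ["a", "b"],
}
weblog_settings = {"title": "Title from settings", "per_page": 10}

weblog = build_namespace(weblog_functions, weblog_settings)

# Template mappings live in a completely separate namespace.
template_mappings = {"title": "title_block.tmpl"}
```

Here `weblog["per_page"]` comes from settings, `weblog["title"]` is the module function, and the shadowed setting is still reachable explicitly as `weblog.maps[1]["title"]`. The `title` key in `template_mappings` clashes with nothing because it is never part of that lookup.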