
Keith Devens .com


Archive: June 20, 2003

← June 18, 2003 | June 21, 2003 →

Friday, June 20, 2003

CSS3 Selectors

Via WebGraphics, check out this nifty article summarizing CSS selectors. It's a two-page article -- the first page summarizes what we know from CSS1 and CSS2, and the second page goes on to what's new in the recently released CSS3 specification. It turns out much of it has already been implemented in Mozilla (awesome).

Caching, Templates, and MVC

I'm starting to look into caching for my web site, now that my CMS is largely in a stable form. So I hopped over to smarty.php.net, and they have some interesting-looking articles: one on template engines and one on Industrial Strength MVC.

Current thoughts for my caching solution:

  • It needs to handle things like Last-Modified, ETags, and conditional GETs (I'm not even sure what those are)
  • What about pages that can be mostly static except for a minor dynamic part?
  • How do I control the caching in a fine-grained manner, so that most pages can be cached for a long time and others possibly regenerated up to every 5 minutes or so?
  • "Who" in my CMS controls this? Is this a CMS-wide thing or do modules expose some kind of interface by which the system can determine what can and can't be cached?
  • How does the system do update notifications so that if new content is entered, the very next time the page loads that content can be shown, rather than waiting a few minutes for the cached copy to expire?
  • How can I integrate this with my CMS without having caching be so pervasive that it touches every piece of code?
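For the first bullet, here is a minimal sketch of what a conditional GET amounts to. The server sends an ETag and a Last-Modified header with each page. On the next request the browser sends them back as If-None-Match and If-Modified-Since, and the server answers 304 Not Modified with an empty body if the page hasn't changed. This is written in Python rather than the CMS's PHP, and `make_validators` and `respond` are hypothetical names for illustration, not part of any real CMS.

```python
import hashlib
from email.utils import formatdate, parsedate_to_datetime


def make_validators(body: bytes, mtime: float) -> dict:
    """Build the two validators a server sends with a cacheable response."""
    return {
        # A hash of the content works as an ETag; any opaque quoted string would do.
        "ETag": '"' + hashlib.sha1(body).hexdigest() + '"',
        "Last-Modified": formatdate(mtime, usegmt=True),
    }


def respond(request_headers: dict, body: bytes, mtime: float):
    """Return (status, headers, body); status is 304 when the client's copy is current."""
    validators = make_validators(body, mtime)
    client_etag = request_headers.get("If-None-Match")
    client_date = request_headers.get("If-Modified-Since")

    if client_etag is not None:
        # When both headers are present, If-None-Match takes precedence.
        fresh = client_etag == validators["ETag"]
    elif client_date is not None:
        # HTTP dates have one-second resolution, so compare whole seconds.
        fresh = int(mtime) <= parsedate_to_datetime(client_date).timestamp()
    else:
        fresh = False

    if fresh:
        return 304, validators, b""  # the client reuses its cached copy
    return 200, validators, body
```

The first request gets a 200 with the full page. A repeat request that echoes the ETag back gets a 304 and no body, which is where the bandwidth saving comes from.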

One possible solution to some of these problems that I'd had in the back of my mind for some time has been an event-based system of some kind. So the CMS can have a set of event mappings set up so that whenever an event is fired, some code belonging to some module is run. An event would furthermore be specific to a module -- the only system-wide change that has to be made is that the system has to be the Ticktockman, running cron jobs or otherwise keeping track of events for the whole system by running them on the next page access, etc.

Since I assume the key for the cache for a particular page will be its URL, and each module "knows" the URL for any of its pieces, and as long as there's only one canonical URL by which to access some piece of content (which is really how it should be), the module should know enough to handle this.

Events:

Events can be anything from "Invalidate cache for this item" to "Send subscribers an e-mail". Events should probably be represented in the CMS as either an object or a plain function. Each module, based on a certain event, will be able to say: $cms->fireEvent('event_name', $params); Events are similar to "actions" within my system, and I'm not totally sure right now that they're genuinely separate things.

Though, actions are really meant to handle forms and check user data (and therefore have extensive validation, data "coercion", and error handling facilities), while events will always be fired by the system and should need no parameter checking, validation, or error handling of their own. Also, events will tend to be more fine grained, and a single action will be likely to have multiple events associated with it. So an action which edits a weblog entry may invalidate the cache for that item, the daily, monthly, and yearly pages that reference that item, as well as the cache for the weblog home page, and e-mail subscribers to that post to boot. Oh yeah, and regenerate the RSS feed.
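The action/event split described above could be sketched roughly like this. It's in Python rather than PHP, and everything except the idea of a `fireEvent` call is an assumption: the `CMS` class, `on_event`, the event names, and the in-memory `cache` dict are all hypothetical stand-ins.

```python
class CMS:
    """Hypothetical stand-in for the CMS object; only the event parts are sketched."""

    def __init__(self):
        self._handlers = {}  # event name -> list of callables

    def on_event(self, name, handler):
        """Register a module's handler for a named event."""
        self._handlers.setdefault(name, []).append(handler)

    def fire_event(self, name, params=None):
        """Run every handler mapped to the event, in registration order."""
        return [handler(params or {}) for handler in self._handlers.get(name, [])]


# A toy page cache keyed by URL, plus a list standing in for outgoing mail.
cache = {
    "/weblog/archive/2003/Jun/20/3985": "<html>post</html>",
    "/weblog/archive/2003/Jun/20": "<html>day</html>",
    "/weblog/": "<html>home</html>",
}
outbox = []

cms = CMS()
cms.on_event("invalidate_cache", lambda p: cache.pop(p["url"], None))
cms.on_event("notify_subscribers", lambda p: outbox.append(p["post_id"]))


def edit_weblog_entry(cms, post_id, affected_urls):
    """An 'action': validation would happen here, then several fine-grained events fire."""
    for url in affected_urls:
        cms.fire_event("invalidate_cache", {"url": url})
    cms.fire_event("notify_subscribers", {"post_id": post_id})


edit_weblog_entry(cms, 3985, list(cache))
```

One action fans out into several events, and the code that edits the entry never touches the cache directly. That is the point of routing everything through the event mappings.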

What's nice about my CMS, which may make it easier to do fine grained caching, is that each "section" is displayed individually, by a separate call to $cms->displaySection(). So, if an individual section can be invalidated, it may make it possible for me to have most of a page be static while still allowing some parts to be dynamic. However, this would probably mean that I still have to run the constructors for all of my modules, as well as display the main template and maybe nested templates until I finally get to the part that has to be dynamically generated. So, rather than just being able to dump out the content of a file which is the cache for a given request URL, I'd still have to execute large portions of my CMS even though the reason they normally have to run doesn't exist because the content they exist to generate is already cached.
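A rough sketch of per-section caching, to make the trade-off concrete. This is Python rather than PHP, and `SectionCache`, its `invalidate` method, and the `cacheable` flag are assumptions for illustration. Only the idea of a `displaySection()` call comes from the CMS.

```python
class SectionCache:
    """Hypothetical fragment cache: one stored chunk of HTML per section name."""

    def __init__(self):
        self._store = {}

    def display_section(self, name, render, cacheable=True):
        """Return cached HTML for a section, calling render() only on a miss."""
        if cacheable and name in self._store:
            return self._store[name]
        html = render()
        if cacheable:
            self._store[name] = html
        return html

    def invalidate(self, name):
        """Drop one section so it is regenerated on the next page load."""
        self._store.pop(name, None)


def render_page(sections, cache):
    """Assemble a page; the outer loop still runs even when every section is cached."""
    return "".join(
        cache.display_section(name, render, cacheable)
        for name, render, cacheable in sections
    )
```

Note that `render_page` still has to run on every request, which is exactly the cost described above: the expensive rendering is skipped, but the surrounding machinery is not.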

Hmmmm....

Plus, I wonder what benefit I could get from Smarty, if any. Smarty's already done the caching thing, but I wonder if I'd be able to integrate it with my CMS and get all the features I want. I'll have to look into that more.

(discussion on Simon's site)

Ninjai back online

Ninjai is back online, but they only have chapters one and two available to watch. What were they up to again? Nine or Ten? Anyway, it's good news.

Teaching Programming

Via PHP Everywhere, ONJava.com: 'Head First Java' Author Interview.

So much learning could be so much more efficient (and fun) if we all paid attention to the needs and goals of our brains, rather than doing things a certain way because that's how everyone else does it. We've known forever (at least since Socrates anyway) that learning is at its weakest when the learner is a passive receiver of information. The learner has to be engaged and actively flexing some neurons. The brain is tuned to pay attention to novelty and chemistry. For example, if something really scary or exciting happens to you, it takes only one episode and you remember it forever. Yet no matter how hard we struggle to learn something technical, it often takes multiple, sometimes dozens, of exposures to the content before it really sinks in and you're able to recall it when you need to. That's your brain trying to do you a big favor.

Very little of the interview has to do with Java specifically; the interview has many great insights into teaching in general.

Concepts, Techniques, and Models of Computer Programming

Via LtU, via Slashdot, Concepts, Techniques, and Models of Computer Programming seems like it's a great book. Currently available for download (PDF) until the book is published in paper form.

Trees in SQL

It turns out there's another way to store hierarchical data in SQL that I didn't know existed. Basically, rather than storing the "parent" for each record (with the root having a NULL parent), you can store a tree by storing "left" and "right" values for each element of the tree, where left and right are numbers assigned during a preorder traversal of the tree: each node gets its left value on the way down and its right value on the way back up.

Via Simon, check out this article from SitePoint.com by Gijs Van Tulder: Storing Hierarchical Data in a Database, and via a comment on Simon's blog, check out database guru Joe Celko's explanation of the same thing.

The first article has more of a tutorial flavor, and has nice diagrams, but Celko's article goes a little more in depth. Also, Gijs only used a binary tree in his example, so I wasn't sure if it'd work for more-than-binary trees, but Celko's example shows that it can.

The only downside to this technique is that it's more of a pain to insert new rows or delete existing ones, since the left and right values of other rows have to be renumbered, but the benefits seem to outweigh the downsides, especially for large or heavily nested recordsets. This is really a great technique, and I'm glad that it's now in my repertoire. I didn't realize this was possible. Thanks for finding this, Simon :)
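To see how the numbering works, here is a small Python sketch that assigns left/right values in memory rather than in SQL. The function names and the sample tree are made up for illustration and aren't taken from either article. The root has three children, to show it isn't limited to binary trees.

```python
def number_tree(tree, root):
    """Assign (left, right) to every node with one depth-first walk.

    `tree` maps a node name to the list of its children.
    """
    bounds = {}
    counter = 0

    def visit(node):
        nonlocal counter
        counter += 1
        left = counter  # numbered on the way down
        for child in tree.get(node, []):
            visit(child)
        counter += 1  # numbered again on the way back up
        bounds[node] = (left, counter)

    visit(root)
    return bounds


def descendants(bounds, node):
    """Everything whose interval sits strictly inside the node's interval."""
    left, right = bounds[node]
    inside = [n for n, (l, r) in bounds.items() if left < l and r < right]
    return sorted(inside, key=lambda n: bounds[n][0])


tree = {
    "Food": ["Fruit", "Meat", "Dairy"],
    "Fruit": ["Red", "Yellow"],
    "Red": ["Cherry"],
}
bounds = number_tree(tree, "Food")
# bounds["Food"] == (1, 14), bounds["Fruit"] == (2, 9), bounds["Cherry"] == (4, 5)
# descendants(bounds, "Fruit") == ["Red", "Cherry", "Yellow"]
```

In SQL the same test becomes a single range comparison on the two columns, with no recursion, which is where the read-side benefit comes from. A node's descendant count also falls out of the numbers directly: (right - left - 1) / 2.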

Unintentionally ironic quote of the day

Simon: Tim Bray's explanation of search engines used data "presented in XML format for readability". Ha.

LZW Compression Paper

Via 0xDECAFBAD, via Kuro5hin, A Technique for High-Performance Data Compression, from Computer, June 1984. Seems like a great place to look to really learn about compression.

Wiki URLs

Here I go again. Now that I'm so happy with how my weblog URLs work (I should never have to change them again, whoopie!), I'm moving on to my wiki URLs.

There are a few basic organizing principles of web content. The most basic principle I've come up with is that of dated versus undated material. Weblog posts are clearly dated. Therefore, there should be a date in the URL. Period. Which is why my weblog URLs work the way they do.

Press releases are dated. However, since press releases are a rarer thing than weblog posts, it's often unnecessary to have a full date directory structure like you'd want with weblogs. So a press release URL of something like domain.com/press_releases/2003-06-19 is reasonable. The point is that press releases are dated, and that should be reflected in the URL.

Articles are often dated. Their content is created on a certain date and apart from small updates or corrections the content is usually left alone. If a major update is done to the article, it'd typically be released as a new article with a new date.

The question you answer when you ask whether something is dated is this: "Is this content allowed to expire?" Or, whether it's ok if what the article says becomes "wrong" or out of date. A company could cease to exist, yet the content of a press release is still "valid", though it's superseded by what happened afterwards, because the press release is clearly dated. The press release isn't modified to say "By the way, this company doesn't exist anymore".

Weblog entries aren't updated either, except possibly to make a small correction, or to point to new content that supersedes the old.

Now, there are other types of content that aren't dated. The homepage of a site always has to be current. Main content pages should always be current. If something on a main content page is inaccurate or out of date, it's simply wrong, even if it was right when the page was written.

Wiki pages fall into this category. A wiki page should always be current. Therefore there should be no date in the URL.

The other primary organizing principle of web content is named versus unnamed. Not everything must be named - usually, especially with database driven content, everything will have an ID which can serve to uniquely identify the resource. However, while names aren't necessary, they can be very useful in making it clear what the content of that resource refers to. You can use both keithdevens.com/weblog/archive/2003/Jun/20/WikiURLs and keithdevens.com/weblog/archive/2003/Jun/20/3985 to refer to this post. But the name is helpful in giving a short mental anchor into the content of the post.

Wiki pages are unique, however, in that the name contained in the page's URL is actually the title of the page as well. It's part of wiki simplicity, and is part of why wikis work so well.

Most wiki URLs contain names that are just squished-together words using StudlyCaps, BumpyCaps, CamelCase, what have you. The word comes to represent some concept - WikiWords are often neologisms. In fact, Wikis are often fertile sources of creativity in language. Today I just learned what a WikiBadge is.

The problem is, while some WikiWords refer to actual concepts that "belong" in StudlyCaps, like StudlyCaps :), WikiBadge, etc. (surf around Wiki for more), many page names don't work well as StudlyCaps. So, PhpReference isn't a "new" thing, it's just a PHP Reference. So it'd be nice if the URL didn't look like it was creating new concepts all the time. And having every linked page be in StudlyCaps just isn't friendly to the reader. It's harder to read, and it keeps the wiki from looking like a normal web page.

So just now I tried creating a page with spaces in its name in my wiki and it worked just fine. The only thing is, plusses in a URL are ugly. I really like how the UnrealWiki does it. Page URLs have underscores instead of plusses for spaces. When the title of the page is rendered, underscores are replaced with spaces, and when a link is made in the text of the page it can be entered with spaces and when the page is actually rendered the spaces are automatically replaced with underscores.

So, I have two choices. I can either do no special processing on the URLs besides normal URL encoding, which would leave me with plusses in URLs for pages with spaces in their names, or I can forgo having underscores in wiki page names (not a big sacrifice) and convert all spaces to underscores, and on rendering and name resolution convert underscores to spaces. Again, I like how the UnrealWiki does it, so I may follow suit.
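The second option boils down to a pair of small conversions. Here's a sketch in Python rather than the wiki's PHP, with `title_to_slug` and `slug_to_title` as hypothetical names. It assumes page titles never contain a literal underscore, which is the sacrifice mentioned above.

```python
from urllib.parse import quote, unquote


def title_to_slug(title: str) -> str:
    """Turn a page title into the name used in the URL: spaces become underscores."""
    return quote(title.strip().replace(" ", "_"))


def slug_to_title(slug: str) -> str:
    """Turn the name from the URL back into the title shown on the page."""
    return unquote(slug).replace("_", " ")


slug = title_to_slug("PHP Reference")  # "PHP_Reference"
title = slug_to_title(slug)            # "PHP Reference"
```

Anything else that isn't URL-safe still goes through normal URL encoding. Only the space gets special treatment, so the common case stays readable.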

Ahh, this was relaxing to write... nothing too important or hard to think about :)

Oh yeah... comments?

See Also: http://www.keithdevens.com/wiki/WeblogUrls
