Tag Archives: cwdb

Scrape Performance Tweaks

Until this point, the scrape had been designed to use a brute-force technique: download every possible thread, existing or not, from the Megatokyo Forums. Exploiting the fact that old threads, upon conversion from Infopop’s ubb.Threads to Invision Power Board, are numbered the same as the first post within them lets us take [...]

cwdb2/Import

We’ve finally got the cwdb2/Import functionality working and semi-tested.
Currently, the ability to pre-select what’s being imported, be it just the first post or the whole thread, has been removed. In addition, the ability to import an entire thread with the parent-child relationships intact is gone. I’ll fix this later.
The usual applies: Import things and try [...]

cwdb and Fanfiction

I’ve been trying to find a more accessible, public discussion medium for items pertaining to the cwdb. So far, the ‘perfect solution’ has eluded me. Maybe experimentation will unveil it.
In the meantime, here’s an attempt at using phpBB.
windless.org/forum/CWDB Policy Discussions

code code code

Code, code, code. My whole life seems to be just a long string of code… Hash marks, delimited data. Recursive strings. Pointers… *
In other news, cwdb2 development continues. What we have left:

changelog – log all actions / modifications.
import – implement webforms for running the import engine. The backend is done.
authentication – required for administration of [...]

Today in History

I was playing with the randomline view in the cwdb today, giving it a little more fun randomness. Specifically, 20% of the time you’ll get a random document from this day in history, that is, today’s date in a random year.
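
For the curious, here is a rough Python sketch of that selection rule. It is not the cwdb's actual code: the function name `pick_document`, the `(title, date)` tuple shape, and the fallback when nothing matches today's date are all assumptions made for illustration.

```python
import random
from datetime import date


def pick_document(docs, today, rng=random, history_chance=0.2):
    """Pick a random document, favouring 'this day in history' some of the time.

    docs is a list of (title, date) tuples; this shape is hypothetical.
    """
    # Roughly history_chance of the time, restrict the pool to documents
    # dated on today's month and day, in any year.
    if rng.random() < history_chance:
        same_day = [
            doc for doc in docs
            if doc[1].month == today.month and doc[1].day == today.day
        ]
        # Assumption: if nothing was written on this day, fall through
        # to the ordinary random pick instead of returning nothing.
        if same_day:
            return rng.choice(same_day)
    return rng.choice(docs)


docs = [
    ("first", date(2001, 7, 4)),
    ("second", date(2003, 7, 4)),
    ("third", date(2002, 1, 1)),
]
print(pick_document(docs, date.today()))
```

Passing in `rng` and `history_chance` keeps the 20% figure tunable and makes the behaviour easy to check with a seeded generator.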

Extispicious

Kevan Davis has implemented (as of July 2004) a very interesting hack based on del.icio.us’s bookmark entry tagging scheme. I wonder how feasible it would be to implement something similarly cool with the cwdb’s datamining goodness.

Tagging Scheme

Currently there is a small list (20?) of tags used to identify categories of works in the cwdb. Would a larger set of tags, roughly 3-4 times as many, be welcome?

Search Engine

I’ve been working on the cwdb’s search engine capabilities recently. In particular, I aim to generalize the search system to include the filtering botch in the last revision.
Mode00: http://cwdb.azaphrael.org/search.php
The original search engine. Quite crappy. Referenced here for comparison.
Mode01: http://cwdb.b4k4.ath.cx/search
The first revision, used as a benchmark on the new system. Uses MATCH() AGAINST() fulltext searching.
Mode02: http://cwdb.b4k4.ath.cx/search2
The [...]

Entry Moderation Flags

Currently, cwdb entry moderation flagging is limited to a single meta value on each entry, which takes one of the fixed values Approved and Not Approved. Would it be useful to users to have a more flexible flagging system for tasks, such as “Incomplete”, “Wrong Author/Date/etc”, “Clean up formatting”, “Duplicate check”, or all other manner of operator [...]
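
One way such a system might look, sketched in Python rather than in the cwdb's own code: each entry carries a set of combinable flags instead of a single Approved / Not Approved value. The names `ModFlag`, `TASK_FLAGS`, and `needs_attention` are hypothetical, and the flag list simply mirrors the examples above.

```python
from enum import Flag, auto


class ModFlag(Flag):
    """Hypothetical per-entry moderation flags; several can be set at once."""
    NONE = 0
    APPROVED = auto()
    INCOMPLETE = auto()
    WRONG_METADATA = auto()      # wrong author, date, etc.
    NEEDS_FORMATTING = auto()    # clean up formatting
    DUPLICATE_CHECK = auto()


# Every flag that represents outstanding work for an operator.
TASK_FLAGS = (
    ModFlag.INCOMPLETE
    | ModFlag.WRONG_METADATA
    | ModFlag.NEEDS_FORMATTING
    | ModFlag.DUPLICATE_CHECK
)


def needs_attention(flags):
    """Return True if any task flag is set, whether or not the entry is approved."""
    return bool(flags & TASK_FLAGS)


entry = ModFlag.APPROVED | ModFlag.INCOMPLETE
print(needs_attention(entry))
```

Under this sketch an entry can be approved and still carry open tasks, which the single-value scheme cannot express.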

User Levels

Unless given a profound reason for not doing so, I will be abolishing all cwdb user-levels in the next codebase revision, with the exception of “Administrator” and “Unauthenticated guest-type user”.
That is to say, insofar as editing entries goes, since we currently have a small team of trusted individuals taking care of all entries, regardless of [...]