The scrape up to this point had been designed to use a brute-force technique to download every possible thread, existing or not, from the Megatokyo Forums. Exploiting the idea that old threads, upon conversion from Infopop’s ubb.Threads to Invision Power Board, are numbered the same as the first post within them lets us take [...]
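As a rough illustration of the brute-force approach described above, here is a minimal sketch in Python. It is not the actual scraper: the `fetch` callable, the ID range, and the stand-in data are all assumptions made for the example, and a real run would replace `fake_fetch` with an HTTP request against the forum.

```python
from typing import Callable, Dict, Iterable, Optional


def scrape_threads(
    ids: Iterable[int],
    fetch: Callable[[int], Optional[str]],
) -> Dict[int, str]:
    """Try every candidate thread ID and keep the ones that exist.

    `fetch` is a hypothetical callable that returns the page text for a
    thread ID, or None when no such thread exists.
    """
    found: Dict[int, str] = {}
    for thread_id in ids:
        page = fetch(thread_id)
        if page is not None:
            found[thread_id] = page
    return found


def fake_fetch(thread_id: int) -> Optional[str]:
    # Stand-in for a real HTTP request; only IDs 3 and 7 "exist" here.
    existing = {3: "thread three", 7: "thread seven"}
    return existing.get(thread_id)


result = scrape_threads(range(1, 10), fake_fetch)
```

Passing the fetcher in as an argument keeps the enumeration loop separate from the network code, so the loop can be tried out without touching the forum at all.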
We’ve finally got the cwdb2/Import functionality working and semi-tested.
Currently the ability to pre-select what’s being imported, be it just the first post, or the whole thread, has been removed. In addition, the ability to import an entire thread with the parent-child relationships intact is gone. I’ll fix this later.
The usual applies: Import things and try [...]
I’ve been trying to find a more accessible, public discussion medium for items pertaining to the cwdb. So far, the ‘perfect solution’ has eluded me. Maybe experimentation will unveil it.
In the meantime, here’s an attempt at using phpBB.
windless.org/forum/CWDB Policy Discussions
Code, code, code. My whole life seems to be just a long string of code… Hash marks, delimited data. Recursive strings. Pointers… *
In other news, cwdb2 development continues. What we have left:
changelog – log all actions / modifications.
import – implement webforms for running the import engine. The backend is done.
authentication – required for administration of [...]
I was playing with the randomline view in the cwdb today, giving it a little more fun randomness. Specifically, 20% of the time you’ll get a random document from this day in history, that is, from today’s date in a random year.
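The selection rule can be sketched roughly as follows. This is only an illustration in Python, not the cwdb code: the `(doc_id, date)` tuple shape, the `DOCS` sample data, and the fallback to a fully random pick when nothing matches today's date are assumptions.

```python
import datetime
import random

# Hypothetical sample data: (document id, date the document was written).
DOCS = [
    ("a", datetime.date(2001, 3, 2)),
    ("b", datetime.date(1999, 7, 15)),
    ("c", datetime.date(2002, 7, 15)),
]


def pick_random_document(docs, today, rng=random, history_chance=0.2):
    """Pick a random document, favouring 'this day in history' 20% of the time.

    `rng` only needs random() and choice(); the random module is the default.
    """
    if rng.random() < history_chance:
        # Same month and day as today, any year.
        same_day = [
            doc for doc in docs
            if (doc[1].month, doc[1].day) == (today.month, today.day)
        ]
        if same_day:
            return rng.choice(same_day)
    # Otherwise (or if nothing matches today's date) pick from everything.
    return rng.choice(docs)


print(pick_random_document(DOCS, datetime.date(2004, 7, 15)))
```

Taking `rng` as a parameter makes the 20% branch easy to force when checking the behaviour by hand.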
Kevan Davis has implemented (as of July 2004) a very interesting hack based on del.icio.us’s bookmark entry tagging scheme. I wonder what feasibility exists for implementing something similarly cool with the cwdb’s datamining goodness.
Currently there is a small list (20?) of tags used to identify categories of works in the cwdb. Would the idea of a greater number of tags, on a scale of 3-4 times, be welcome?
I’ve been working on the cwdb’s search engine capabilities recently. In particular, I aim to generalize the search system to include the filtering botch in the last revision.
Mode00: http://cwdb.azaphrael.org/search.php
The original search engine. Quite crappy. Referenced here for comparison.
Mode01: http://cwdb.b4k4.ath.cx/search
The first revision, used as a benchmark on the new system. Uses MATCH() AGAINST() fulltext searching.
Mode02: http://cwdb.b4k4.ath.cx/search2
The [...]
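For readers unfamiliar with the MATCH() AGAINST() fulltext searching mentioned under Mode01, here is a small sketch of what such a MySQL query might look like, written as a Python helper that builds the statement and its parameters. The table name `entries` and the columns `id`, `title`, and `body` are hypothetical, not the cwdb schema, and the columns would need a FULLTEXT index covering them for the query to run.

```python
def build_fulltext_query(terms, limit=20):
    """Build a parameterized MySQL fulltext query, ranked by relevance.

    Table and column names are placeholders for illustration only.
    """
    sql = (
        "SELECT id, title, MATCH(title, body) AGAINST (%s) AS score "
        "FROM entries "
        "WHERE MATCH(title, body) AGAINST (%s) "
        "ORDER BY score DESC "
        "LIMIT %s"
    )
    # The search terms appear twice: once for the score, once for the filter.
    return sql, (terms, terms, limit)


sql, params = build_fulltext_query("recursive strings", limit=10)
```

The `%s` placeholders follow the parameter style of the common Python MySQL drivers, so the pair can be handed to a cursor's `execute(sql, params)` rather than pasting user input into the SQL string.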
Currently, cwdb entry moderation flagging is limited to a single meta value on each entry, listing the fixed values Approved and Not Approved. Would it be useful to the users to have a more flexible flagging system for tasks? Such as “Incomplete”, “Wrong Author/Date/etc”, “Clean up formatting”, “Duplicate check”, or all other manner of operator [...]
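One way to picture the more flexible flagging idea is a set of independent task flags kept alongside the existing approval value. The sketch below is in Python and is purely illustrative: the `Entry` class, the flag names, and the choice to keep approval as a separate boolean are assumptions, not the cwdb's data model.

```python
from dataclasses import dataclass
from enum import Flag, auto


class EntryFlag(Flag):
    """Hypothetical task flags; several can be set on one entry at once."""
    NONE = 0
    INCOMPLETE = auto()
    WRONG_METADATA = auto()        # wrong author, date, etc.
    CLEAN_UP_FORMATTING = auto()
    DUPLICATE_CHECK = auto()


@dataclass
class Entry:
    title: str
    approved: bool = False          # the existing Approved / Not Approved value
    flags: EntryFlag = EntryFlag.NONE

    def flag(self, task: EntryFlag) -> None:
        # Set one or more task flags without disturbing the others.
        self.flags |= task

    def clear(self, task: EntryFlag) -> None:
        # Remove one or more task flags once the work is done.
        self.flags &= ~task

    def open_tasks(self):
        # Names of every non-empty flag currently set, in definition order.
        return [f.name for f in EntryFlag if f.value and f in self.flags]


entry = Entry(title="Example entry")
entry.flag(EntryFlag.INCOMPLETE | EntryFlag.DUPLICATE_CHECK)
entry.clear(EntryFlag.INCOMPLETE)
print(entry.open_tasks())
```

Keeping approval separate from the task flags means an entry can be approved and still carry a pending clean-up task, which a single meta value cannot express.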
Unless given a profound reason for not doing so, I will be abolishing all cwdb user-levels in the next codebase revision, with the exception of “Administrator” and “Unauthenticated guest-type user”.
That is to say, insofar as editing entries goes, since we currently have a small team of trusted individuals taking care of all entries, regardless of [...]