Securita Roadmap

Well, I checked in some source this evening from my work the other day. There are lots of problems, and it’s so trivial right now it’s sad. It’s still going to need some serious work. But hey, that’s the fun part, right?

What I need now are some people to lend a hand. I’ve defined the RDF file for now, and put together a sample. Obviously that file needs to live somewhere that lets us hook up an auto-updater later on, one that can update just that file.

So anyone who can submit patches toward getting to 0.2 is kindly asked to lend a hand. This could be a good step in bringing Mozilla into schools and other places where children need to be protected from harmful content.


Securita 0.1 in the works

Well, I did some work on it today. It’s now in extension form (the old version, prior to Ben Goodger’s changes). It’s also using a “database” (an array) of 18 keywords right now, with a fair amount of success.

Now the big topic will be creating an RDF schema and a method for scanning that is both efficient and “fuzzy”. Allow me to expand:

We can’t just ban a page because of the word “ass”, but “ass” together with several other words could add up to a page worth blocking. So what needs to be done is to attach point values to all words (scientifically). Then, if a page’s total point value gets higher than 5.0, we block it. This is basically how SpamAssassin operates. So what I need is for someone to do some experimentation, and find out exactly what keywords to use, and what point values to attach to them. A nice thing would be a little C++ app that could be used to generate scores based on data. I’m rather open to suggestions on how to do this. So… give suggestions, code solutions. Submit them to me, be a hero.
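To make the scoring idea concrete, here is a minimal sketch in JavaScript. The keyword table, its point values, and the function names are placeholders made up for illustration; only the 5.0 threshold comes from the description above. It also assumes each keyword counts once per page and only matches as a whole word, which is just one possible choice.

```javascript
// Hypothetical keyword table: words and point values are invented for
// illustration and are NOT the real Securita list.
const KEYWORD_SCORES = {
  ass: 1.5,
  casino: 2.0,
  poker: 2.0,
  jackpot: 2.5,
};

// Threshold taken from the post: block when the total is higher than 5.0.
const BLOCK_THRESHOLD = 5.0;

// Add up the points of every distinct keyword that appears in the text.
// Assumption: a keyword counts once per page, and only as a whole word.
function scorePage(text, scores = KEYWORD_SCORES) {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  let total = 0;
  for (const word of new Set(words)) {
    if (Object.prototype.hasOwnProperty.call(scores, word)) {
      total += scores[word];
    }
  }
  return total;
}

// A page is blocked only when its total score exceeds the threshold.
function shouldBlock(text, scores = KEYWORD_SCORES) {
  return scorePage(text, scores) > BLOCK_THRESHOLD;
}
```

With this table, a page that only mentions “ass” scores 1.5 and stays visible, while a page that hits several keywords crosses the threshold and gets blocked.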

The RDF schema also needs to contain a method field. Since regex matching is extremely slow and bloated, we obviously don’t want to use it more than we need to. So we have the option to use window.find() instead. Using that method gives a speed increase (with obvious limitations).
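As a rough sketch of how a per-keyword method field might be used, the snippet below picks the cheap search by default and only falls back to a regex when an entry asks for one. The entry layout, the field names, and the "find"/"regex" values are assumptions, not the actual schema. Because window.find() only exists inside the browser, a plain substring check stands in for it here.

```javascript
// Hypothetical keyword entries; field names and "method" values are assumptions.
const KEYWORD_ENTRIES = [
  { pattern: "poker", method: "find", score: 2.0 },
  { pattern: "jack\\s*pot", method: "regex", score: 2.5 },
];

// Decide whether one entry matches the page text.
function entryMatches(entry, text) {
  if (entry.method === "regex") {
    // Expensive path: only taken when the entry explicitly asks for it.
    return new RegExp(entry.pattern, "i").test(text);
  }
  // Cheap path: case-insensitive substring search, standing in for
  // window.find(), which is not available outside the browser.
  return text.toLowerCase().includes(entry.pattern.toLowerCase());
}

// Sum the scores of all entries that match the text.
function scoreWithMethods(text, entries = KEYWORD_ENTRIES) {
  let total = 0;
  for (const entry of entries) {
    if (entryMatches(entry, text)) {
      total += entry.score;
    }
  }
  return total;
}
```

The limitation shows up right away: the substring path also matches inside longer words, so any keyword that needs word boundaries or spelling variants has to pay for the regex.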

Perhaps in the future, changing the core engine to a compiled binary would be better, but for now we make do with JavaScript. So far, performance on a 1.8GHz system is actually not much slower at all; I really don’t notice it. But we will need some more keywords. I figure about 50-100, provided we use a scoring system like the one mentioned above.

So code is coming, hopefully with an initial checkin soon; I’m just not ready yet, and busy. I’ve had about 3 hours of free time today to play, and that was my break from the academic books. More to come, but let’s get the creative juices flowing.


Another week done

My goals for the weekend:

Catch up on email/IMs and bugs I missed while sick (almost done)
See how far we are from getting a few blocking problems with relaunch resolved
Follow up on some forum threads I fell behind on
Revise term paper on technology waste
Securita research (most likely no code)
Research for Biology presentation (anyone have some good info on wolf reintroduction?)
Project Aquarius

Also wouldn’t mind working out, sleeping, and of course watching some TV 😀 .


Securita, I’m starting (finally)