Securita Roadmap

Well, I checked in some source this evening from my work the other day. There are lots of problems, and it’s so trivial right now it’s sad. It’s going to need some serious work still. But hey, that’s the fun part, right?

What I need now are some people to lend a hand. I’ve defined the RDF file for now and put together a sample. Obviously it needs to be hosted somewhere that we can later hook up to an auto-updater that updates just that file.

So anyone who can submit patches toward getting to 0.2 is kindly asked to lend a hand. It could be a good step in bringing Mozilla into schools and other places where children need protection from harmful content.

Securita 0.1 in the works

Well, I did some work on it today. It’s now in extension form (the old version, prior to Ben Goodger’s changes). It’s also using a “database” (an array) of 18 keywords right now, with a fair amount of success.

Now the big topic will be creating an RDF schema and a method for scanning that is both efficient and “fuzzy”. Allow me to expand:

We can’t just ban a page because of the word “ass”, but that word together with several others could mark a page worth blocking. So what needs to be done is to attach point values to all words (scientifically). Then, if a page’s total point value gets higher than 5.0, we block it. This is basically how SpamAssassin operates. So what I need is for someone to do some experimentation and find out exactly what keywords to use and what point values to attach to them. A nice thing would be a little C++ app that could be used to generate scores based on data. I’m rather open to suggestions on how to do this. So… give suggestions, code solutions. Submit them to me, be a hero.
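To make the scoring idea concrete, here is a rough sketch in JavaScript. The keywords, their point values, and the function names are all placeholders I made up for illustration; only the 5.0 threshold comes from the description above. It also assumes every occurrence of a keyword adds to the total, which may or may not be what we want in the end.

```javascript
// Hypothetical keyword weights. These values are placeholders, not measured data.
var KEYWORD_SCORES = {
  "alpha": 2.5,
  "beta": 1.5,
  "gamma": 3.0
};

// Threshold taken from the description above.
var BLOCK_THRESHOLD = 5.0;

// Sum the point values of every keyword occurrence in the page text.
function scorePage(text, scores) {
  // Split on anything that is not a letter or digit to get rough "words".
  var words = text.toLowerCase().split(/[^a-z0-9]+/);
  var total = 0;
  for (var i = 0; i < words.length; i++) {
    if (Object.prototype.hasOwnProperty.call(scores, words[i])) {
      total += scores[words[i]];
    }
  }
  return total;
}

// Block only when the total is strictly higher than the threshold.
function shouldBlock(text, scores, threshold) {
  return scorePage(text, scores) > threshold;
}
```

Calling `shouldBlock(pageText, KEYWORD_SCORES, BLOCK_THRESHOLD)` would then decide whether to block. Counting each keyword only once per page is another option worth testing.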

The RDF schema also needs to contain a method field. Since regular expression matching is extremely slow and bloated, we obviously don’t want to do it more than we need to. So we have the option to use window.find() instead. That method gives a speed increase (with obvious limitations).

Perhaps in the future, changing the core engine to a compiled binary would be better, but for now, we make do with JavaScript. So far performance on a 1.8GHz system is actually not much slower at all; I really don’t notice it. But we will need some more keywords. I figure about 50-100, provided we use a scoring system like the one mentioned above.

So code is coming, hopefully with an initial checkin soon. I’m just not ready yet, and I’m busy. I’ve had about 3hrs of free time today to play, and that was my break from the academic books. More to come, but let’s get the creative juices flowing.

Another week done

My goals for the weekend:

Catch up on email/IMs, bugs I missed while sick (almost done)
See how far away we are from getting a few blocking problems resolved for the MacVillage.net relaunch
Follow up on some forum threads I fell behind on
Revise term paper on technology waste
Securita research (most likely no code)
Research for Biology Presentation (anyone have some good info on wolf reintroduction?)
Project Aquarius

Also wouldn’t mind working out, sleeping, and of course watching some TV 😀 .

Securita, I’m starting (finally)

Well, I’ve been delayed by one thing or another for quite some time. Anyway, I’m starting to gather my thoughts, along with the various emails and conversations I’ve had with people about the project over that time. Just to review, Securita is a project to create a content filter for Mozilla.

The first checkin will not happen until I’m at version 0.1, simply because the code is too messy and a pain in the butt right now. At 0.1, I’ll check in the source, and perhaps add a few devs to the project as appropriate.

The goals for 0.1

  • Load RDF datafile
  • Scan page for matches to RDF datafile
  • Display error when scan returns true
  • Make XPI for Firefox/Seamonkey

At this point, I’ve got a mini engine that can scan (including regexp, thanks to caillon) and return true if there’s a match (from an array of items).

My time is slightly limited right now, but I don’t want my short time to hold this project up any more. So what I need is the following:

Most Wanted #1

Method to load RDF datafile, and loop through for each element. Sample

Brief Rundown
String: what to search for
Scan: type of scan to perform. Either string or regexp
Type: URL, text, image, hybrid

Simply get to the point of doing a demo loop like the following:

// strings, scans and types hold the String, Scan and Type fields of each element
for (var i = 0; i < ELEMENTS; i++) {
  scan(strings[i], scans[i], types[i]);
}
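
For anyone picking this up, here is a slightly fuller sketch of what that loop could look like once the entries are loaded. It is only an illustration under a few assumptions: the entries are hard-coded instead of read from the RDF datafile, the page is passed in as a plain object with `url` and `text` fields, and the names `entries`, `scanEntry` and `scanPage` are made up. Only the URL and text types are handled; image and hybrid are left out.

```javascript
// Hypothetical entries, standing in for what would be loaded from the RDF datafile.
var entries = [
  { string: "badword", scan: "string", type: "text" },
  { string: "bad[a-z]*site", scan: "regexp", type: "URL" }
];

// Returns true when a single entry matches the page.
// `page` is an assumed object of the form { url: "...", text: "..." }.
function scanEntry(entry, page) {
  // Pick what to search in, based on the Type field.
  var target = entry.type === "URL" ? page.url : page.text;
  if (entry.scan === "regexp") {
    // Regular expression scan, case-insensitive.
    return new RegExp(entry.string, "i").test(target);
  }
  // Plain string scan, case-insensitive.
  return target.toLowerCase().indexOf(entry.string.toLowerCase()) !== -1;
}

// Loop through every entry and stop at the first match.
function scanPage(entries, page) {
  for (var i = 0; i < entries.length; i++) {
    if (scanEntry(entries[i], page)) {
      return true;
    }
  }
  return false;
}
```

Something like `scanPage(entries, { url: location.href, text: document.body.textContent })` would be the call from inside a page, though how the script gets attached to the page is still the open question below.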

Most Wanted #2

I also need a way to hook up the script that does the processing (filterRun.js) so that it executes on each page loaded. I’m not exactly sure how to do this, and time is a bit short right now. What I’m looking for is a simple extension boilerplate that attaches filterRun.js to every tab, with the script containing the following code to be executed as the page loads:

alert("Securita beats up the butterfly");

Anyone who can contribute these two things would be extremely helpful. Email submissions as per my contact info on this website.

Securita

Quite some time ago, I started the Securita project, to implement a word filter, capable of providing adequate protection to those who wish to employ such technology.

Not much has happened in that project.

I believe that, with Firebird approaching 1.0 sooner than people think (it’s not that far off), now is the time to seriously consider getting such an extension off the ground. There is demand for such a product, as I’ve gotten several emails in recent months regarding the status of the project. Just this evening I was listening to the Computer Outlook Radio broadcast (where several Mozilla Foundation employees talked). At 39:00 into the broadcast, there’s a mention of such a product, and the admission that we don’t have one. Unfortunately, a good deal of the blame goes to me.

But I want to rectify that.

Unfortunately, I’m still not enough of a programmer to be laying out the source code. So I am making a request for someone to aid me in getting this project going full speed. Those interested should contact me.