april 22, 2003 so damned busy
a lot of people seem to be curious about my time. perhaps i have my fingers in too many pots and projects. deadly readers want to know why we haven't moved to a new content system or why we can't come up with a more effective way to deal with the trolls. gentle readers here want to know why i'm still not writing up my spam analysis and why my aggregators sit at some early beta stage (and the code is not prime time at all). my editors at artech want to know why i don't reply to their emails and why i haven't delivered a completed manuscript yet.
where does jose's time go?
i miss the flexibility i had in grad school to pursue projects and random things. i'm writing better code now, i need to just find the time to work on it (ie my nsh improvements). i miss being able to go to random cons and see people.
april 21, 2003 early adopter ... i guess
jobo tells me i'm an early adopter. this is very shocking to me ... i'm usually not. at least i don't think i am. i'm lazy, i go for basics ... i wear simple clothes because i don't like shopping for new ones. it's easier to go with simple styles than to always shop for the season's style. i like my computer work to be focused and simple. wireless and rss ... i get bummed when i can't find things i use readily available.
i was asking jobo about more sites to stuff into home schooled hacker. i've found myself really enjoying the content, but i want more. i want it to be more dynamic ... oh well, i can live. anyhow, he tells me so little is out there that focuses on testing and software development that syndicates in rss. to make matters worse i once had a disdain for rdf 2.0 but now i like it. rss 0.91 just doesn't give enough detail about the attributes of the content, like its date and such. anyhow, he didn't have much to offer for sites i really wanted to add. good sites, just not appropriate for this one.
speaking of aggregation, i find myself not commenting on peoples' blogs i aggregate because it's now two clicks away to make a comment, not one. i wonder what blog aggregation on a larger scale will mean for feedback and comments or even things like trackback.
oh, and speaking of spam analysis ... i did finally download and start looking at the goods from the spam archives. you may be shocked but so far, everything i have been saying has held up. same TLD and 2LD dispersal, all of the other analysis is forthcoming ... and this is on about 15 times the spam from more sources. i really need to write this up, but i suspect places like the RBL and SPEWS won't give it any credit. oh well ...
sleep beckons. feedback always welcome ...
april 19, 2003 home schooled
home schooled hacker is a project i'm slowly working on. in a nutshell i'm using my aggregator stuff to bring me (and you, should you wish to see it) software development writings. i call it home schooled hacker since that's how i learned everything, more or less, by reading books and websites. so, why not automate the process of getting things to read.
so, what needs to be done? fix up the view so it's easy to see what's new, what's good for that day, etc ... color schemes ... more content ... etc. suggestions always welcome. it's now going to be updated twice a day.
also, if you're a slashback user for deadly's rss, their dns is STILL bad (i wonder what's wrong with our dns at deadly). it's not our rss, it's the dns i think. everyone else who gets the rss does just fine with it.
april 17, 2003 more spam analysis
i took a few minutes last night (and again this morning to graph the results) to analyze the 1000s of spam i have collected. i was kind of boggled at how people think that blocking an entire TLD (ie .fr or .kr) can be a) effective at stopping spam or b) sane. perhaps i just have more friends all over the world. perhaps my spam is not like theirs. i don't know.
the two figures above show the top level and second level domains of the spam i analyzed, about 3000 messages (i have around 10000 collected right now, but some of it is offline ... will analyze later). this compares favorably to another spam corpus i have around but didn't graph here.
in a nutshell ... blocking a TLD seems pointless, dangerous, and useless at stopping an appreciable amount of spam. secondly, you can see that second level domain blacklists and whitelists are also pointless, stupid, and asinine. what are you going to do, block .com? block aol.com and stop your mother, cousin, etc from mailing you?
examine the payload, people, not the envelope.
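for anyone who wants to run the same kind of tally on their own spam pile, here is a minimal python sketch. it is not the script behind the figures; it assumes the domains come from sender addresses, and it treats the "second level domain" naively as the last two labels (so something under .co.kr would be counted as co.kr). the function names and sample addresses are made up for illustration.

```python
from collections import Counter


def domain_levels(address):
    """Return (tld, second_level) for an email address, or None if malformed.

    Assumption: the second-level domain is simply the last two labels,
    with no public-suffix handling.
    """
    _, sep, domain = address.strip().lower().rpartition("@")
    labels = [part for part in domain.split(".") if part]
    if not sep or len(labels) < 2:
        return None
    return labels[-1], ".".join(labels[-2:])


def tally(addresses):
    """Count how many addresses fall under each TLD and each 2LD."""
    tlds, slds = Counter(), Counter()
    for addr in addresses:
        levels = domain_levels(addr)
        if levels is None:
            continue  # skip anything without a usable domain
        tld, sld = levels
        tlds[tld] += 1
        slds[sld] += 1
    return tlds, slds


if __name__ == "__main__":
    # hypothetical sender addresses, not real data
    sample = ["a@aol.com", "b@mail.aol.com", "c@example.fr", "bogus"]
    tlds, slds = tally(sample)
    print(tlds.most_common())
    print(slds.most_common())
```

if the counts spread across a lot of TLDs and 2LDs with the big general-purpose ones on top, that is the dispersal argument in numbers.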
april 15, 2003 producer ... finally
i am finally a producer of an RSS/RDF feed. at your left you will find an RSS feed (RSS 2.0) of this site. now enjoy ... several people asked me for it, here you go. it has one small bug for the dc:date tag, but that's it. should be good for most of your aggregators and RSS browsers.
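if you are writing your own aggregator and want the date, here is a rough sketch of pulling dc:date out of an RSS 2.0 item with python's standard library. the sample feed below is invented for illustration and is not this site's actual feed; the only real thing in it is the dublin core namespace URI. items without a dc:date simply come back with None.

```python
import xml.etree.ElementTree as ET

# dublin core namespace, in ElementTree's {uri}tag notation
DC = "{http://purl.org/dc/elements/1.1/}"

# made-up feed for illustration only
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>example feed</title>
    <item>
      <title>producer ... finally</title>
      <dc:date>2003-04-15T00:00:00Z</dc:date>
    </item>
    <item>
      <title>no date here</title>
    </item>
  </channel>
</rss>"""


def item_dates(feed_xml):
    """Return a list of (title, dc:date or None) for every item in the feed."""
    root = ET.fromstring(feed_xml)
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        date = item.findtext(DC + "date")  # None when the tag is missing
        results.append((title, date))
    return results


if __name__ == "__main__":
    for title, date in item_dates(SAMPLE):
        print(title, "->", date)
```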
april 13, 2003 i return
my cansecwest pics are up (see side bar) and my slides are also up (again see left at presentations). jim moved deadly to a new server, and i fixed the rss feed from it ... now you get content.
showed a bunch of people trogdor and aggie, and they seemed to like it. showed people my spam analysis, again they liked it ... pretty cool work going on with lots of others, too. felt good to be able to show off some stuff, too.
i am exhausted ...
april 9, 2003 away ... back later
i'm at cansecwest in vancouver. i'll be out of town and probably off the net for a while. please be patient ... i'll post pics when i get back. lots of pics. overdue pics.
april 6, 2003 about time
i finally brought up my macppc/netbsd machine and loaded it on the network. i have been getting rid of a lot of machines and hardware lately, i now have room for it upstairs. i can now also finally do the last CVS action on my thesis:
RCS file: /home/jose/cvs/mythesis/thesis.tex,v
Working file: thesis.tex
head: 1.11
branch:
locks: strict
access list:
symbolic names:
    FINAL_DRAFT: 1.11
    DEFENSE_COPY: 1.10
    READER_COPY: 1.9
    DRAFT_2: 1.7
    FIRST_DRAFT: 1.6
    start: 188.8.131.52
    me: 1.1.1
keyword substitution: kv
total revisions: 12; selected revisions: 12
it feels so good to have that done.
jobo and lynn have bitched me out repeatedly for being a demanding consumer of RSS and XML but never a producer. maybe i'll actually write a system to translate this to XML ...
i hate mutt. a lot. it's everything that is wrong with free software, which i'm also growing to hate with a passion. but i'm sick and tired of pine's signal handling problems and lost mail. so ... mutt. :-/ i have a muttrc you may be interested in, close to pine ... i fucking hate having to tune software to work reasonably well. mutt is crap.
april 03, 2003 sleepless
hacked on trogdor this evening. after about an hour of python hacking i had a working prototype which i put into production. once a day harvesting ... known bugs right now: stef's got a bad xml feed, the parser breaks on her stuff. lambert's timestamp from radioland isn't matching my RE ... need to tweak that. joel on software is another one i need to work on matching better. however, i'm much happier with the layout, it now looks like a blog and reads easily. let's see how it does ... feedback welcome. i'll have to get the code wrapped up soon and make it available to people.
this is a stupid bug that cost me a few hours but was easy to fix. it's re.search(needle, haystack) and not re.search(haystack, needle). now i see why it never matched. oh, and some of the feeds have invalid RSS ...
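a tiny illustration of why the swapped call fails quietly instead of blowing up. the pattern and the string here are made up and are not the ones from trogdor. the swapped call is still a legal regex search, it just looks for the haystack text inside the pattern, so it returns None.

```python
import re

# made-up example values, not the real trogdor pattern or feed text
needle = r"\d{4}-\d{2}-\d{2}"              # pattern: an ISO-style date
haystack = "posted 2003-04-03 by somebody"  # text to search in

# correct order: re.search(pattern, string)
right = re.search(needle, haystack)

# swapped order: the haystack is treated as the pattern, and it does not
# occur inside the needle string, so no exception is raised and nothing matches
wrong = re.search(haystack, needle)

# one way to make the order hard to get wrong: compile the pattern once
# and call .search() on it, which only takes the string
DATE_RE = re.compile(needle)
compiled = DATE_RE.search(haystack)

if __name__ == "__main__":
    print(right.group(0) if right else None)
    print(wrong)
    print(compiled.group(0) if compiled else None)
```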