From Computer Tyme Support Wiki

Revision as of 18:19, 27 March 2008

Were you Blacklisted and want to be removed?

Have you been blacklisted on our Hostkarma list? To check or be removed Click Here.

Creating White/Yellow/Black DNS lists for email systems in the fight against spam.

These are free DNS host karma listing servers that tell the world which servers are sending spam, nonspam, or a mix of spam and nonspam. This is a service of Junk Email Filter dot com, and one of many technologies used in advanced email filtering.

LICENSE - Using these Lists is Almost Free

This list is a copyrighted list. Unless you really load our servers and suck a lot of bandwidth, use of these lists is almost free.

If you are a non-profit organization usage of this list is free. In fact, if you are a progressive nonprofit you might qualify for free spam filtering service as our way of helping to support progressive causes. (We determine what we consider progressive)

If you are not a commercial site you can use it for free.

If you are a small business you can use it for free if you post a link to us. Link to http://www.junkemailfilter.com.

If you have or serve over 1000 email accounts or make more than 250,000 DNS queries per day then you can license this service for $1000/year or $100/month and a link to our site.

Also, anyone who uses our lists or our data either directly from us or indirectly through a third party grants us a license for us to use your data from any lists that you might publish either directly or indirectly. If you are using ours we get to use yours.

For those of you who are large users we have a data feed that you can use. Email us about how to use it: support@junkemailfilter.com

List Attitude

Different lists have different criteria for listing, which to a large extent reflect the personality of the people behind the list. Some lists are angry lists where they list everything and if you got on their list it's your fault. This list is not an angry list. Our position is that if you are a spammer we want to block you. If you are not a spammer we want to make sure your email gets delivered. And if you have been hacked or have a virus we want to help you get back to normal and get you off our blacklist as quickly as we can. And if you never send spam we want you on our whitelist. To us it's all about delivering good email and blocking bad email. Our mission is to get it right and to be professional and friendly about it.

How to use the Lists

Junk Email Filter dot com provides several public lists. One is a black list to block spam and another is a white list to either pass nonspam or to keep sites from being blocked. Blocking is done by IP address, which is something spammers can't spoof. We look at email hosts as being one of three kinds: hosts that generate only spam, which we blacklist; hosts that generate only nonspam, which we whitelist; and hosts that generate a mix, which we yellow list.

Our list server is hostkarma.junkemailfilter.com - this server returns several different results depending on what kind of listing it is. If the server returns 127.0.0.1 then it is whitelisted. You can accept the email without any further checking.

If the result is 127.0.0.3 then the host is yellow listed. Yellow listing means that host generates some spam and some nonspam. (examples, yahoo.com, hotmail.com) What that means is that this host should never be blacklisted and that other IP based blacklists should be bypassed to prevent false positives. If the result is 127.0.0.2 it is blacklisted - if the IP is listed here you can bounce it without further checking.

And if the result is 127.0.0.4 it is brownlisted, which means it is on its way to being blacklisted but hasn't quite got there yet. It might be worth a few points using SpamAssassin.

127.0.0.1 - whitelist - trusted nonspam
127.0.0.2 - blacklist - block spam
127.0.0.3 - yellowlist - mix of spam and nonspam
127.0.0.4 - brownlist - all spam - but not yet enough to blacklist

Like all IP based lists the IP address is presented in reverse. So if you are looking up 1.2.3.4 you would request:

4.3.2.1.hostkarma.junkemailfilter.com
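As a minimal sketch in Python, this is one way to build the query name and decode the answer. The function names `hostkarma_query_name` and `decode_result` are made up for illustration; only the zone name and the four return codes come from the text above. The code does not perform the DNS lookup itself.

```python
import ipaddress

ZONE = "hostkarma.junkemailfilter.com"

# Return codes as listed above.
RESULT_MEANING = {
    "127.0.0.1": "white",
    "127.0.0.2": "black",
    "127.0.0.3": "yellow",
    "127.0.0.4": "brown",
}


def hostkarma_query_name(ip, zone=ZONE):
    """Build the DNS name to query for an IPv4 address (octets reversed)."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def decode_result(answer):
    """Map a returned A record to a list colour; None means not listed."""
    return RESULT_MEANING.get(answer)
```

Calling `hostkarma_query_name("1.2.3.4")` gives `4.3.2.1.hostkarma.junkemailfilter.com`, the name shown above.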

Name Based Lookups

In addition to IP based lookups the hostkarma list also supports name based lookups. For example, to look up wellsfargo.com you would request:

wellsfargo.com.hostkarma.junkemailfilter.com

As with the IP lists a 127.0.0.1 is a white listing, 127.0.0.2 is a black listing.
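A name based lookup could be sketched in Python like this. The helper name `lookup_name` and its `resolve` parameter are assumptions made for illustration; by default it uses the standard library's `socket.gethostbyname`, and the resolver can be swapped out so the function can be tried without network access.

```python
import socket

ZONE = "hostkarma.junkemailfilter.com"


def lookup_name(hostname, resolve=socket.gethostbyname, zone=ZONE):
    """Return the A record for <hostname>.<zone>, or None if nothing is returned."""
    query = f"{hostname.rstrip('.').lower()}.{zone}"
    try:
        return resolve(query)
    except socket.gaierror:
        # Resolution failed, typically because the name is not on any list.
        return None
```

A result of 127.0.0.1 or 127.0.0.2 is then read the same way as for an IP lookup.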

List Logic

The best way to use the lists is to do it in a specific order. First you check the white list and see if the host is white. If so you accept the message without further processing. Then you see if the host is yellow. If so - you skip all your blacklist tests. Then you check your blacklists and if listed you bounce it. Whatever email is left is then tested with all your other testing methods like Spam Assassin.
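The order described above can be sketched as a small decision function in Python. The function name `decide` and its return labels are invented for illustration, and it assumes you have already done the hostkarma lookup and checked whatever other blacklists you use.

```python
def decide(hostkarma_answer, on_other_blacklist):
    """Return 'accept', 'reject', or 'filter' following the order described above.

    hostkarma_answer: the A record returned by hostkarma, or None if not listed.
    on_other_blacklist: True if any other IP blacklist you use lists the host.
    """
    if hostkarma_answer == "127.0.0.1":
        return "accept"   # white: accept without further processing
    if hostkarma_answer == "127.0.0.3":
        return "filter"   # yellow: skip every blacklist test, still spam test
    if hostkarma_answer == "127.0.0.2" or on_other_blacklist:
        return "reject"   # blacklisted here or elsewhere: bounce it
    return "filter"       # whatever is left goes to the other tests
```

Note that the yellow check comes before any blacklist check, so a yellow listed host is never rejected because of another blacklist.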

Exim Examples

Exim is an extremely powerful MTA, probably the most powerful MTA on the planet. That's why I like it so much. I want to do what I want to do and Exim allows me to do it.

# Mark it White 
warn dnslists = hostkarma.junkemailfilter.com=127.0.0.1
     set acl_c_white = white - dnswl - $sender_fullhost

# Mark it Yellow 
warn dnslists = hostkarma.junkemailfilter.com=127.0.0.3
     set acl_c_yellow = yellow - $sender_fullhost

# Using the Black List
deny dnslists = hostkarma.junkemailfilter.com=127.0.0.2

# Other Blacklists
deny !dnslists = hostkarma.junkemailfilter.com=127.0.0.1,127.0.0.3
     dnslists = zen.spamhaus.org/<;$sender_host_address;$sender_address_domain :\
     nomail.rhsbl.sorbs.net/$sender_address_domain : cbl.abuseat.org :\
     list.dsbl.org : web.dnsbl.sorbs.net : socks.dnsbl.sorbs.net :\
     http.dnsbl.sorbs.net

Postfix Examples

Postfix For Blacklisting:

reject_rbl_client hostkarma.junkemailfilter.com=127.0.0.2

It appears Postfix doesn't support whitelisting, but if this is wrong let me know and I'll correct this.

Spam Assassin Examples

Spam Assassin can access the white and black lists for scoring.

header __RCVD_IN_JMF eval:check_rbl('JMF-lastexternal','hostkarma.junkemailfilter.com.')
describe __RCVD_IN_JMF Sender listed in JunkEmailFilter
tflags __RCVD_IN_JMF net
 
header RCVD_IN_JMF_W eval:check_rbl_sub('JMF-lastexternal', '127.0.0.1')
describe RCVD_IN_JMF_W Sender listed in JMF-WHITE
tflags RCVD_IN_JMF_W net nice
score RCVD_IN_JMF_W -5
 
header RCVD_IN_JMF_BL eval:check_rbl_sub('JMF-lastexternal', '127.0.0.2')
describe RCVD_IN_JMF_BL Sender listed in JMF-BLACK
tflags RCVD_IN_JMF_BL net
score RCVD_IN_JMF_BL 3.0
 
header RCVD_IN_JMF_BR eval:check_rbl_sub('JMF-lastexternal', '127.0.0.4')
describe RCVD_IN_JMF_BR Sender listed in JMF-BROWN
tflags RCVD_IN_JMF_BR net
score RCVD_IN_JMF_BR 1.0

Name Based DNS Lookup

The hostkarma DNS list supports name based lookups as well as IP based lookups.

<hostname>.hostkarma.junkemailfilter.com

127.0.0.1 = whitelisted
127.0.0.2 = blacklisted
127.0.0.3 = yellowlisted

Example: dig hermes.apache.org.hostkarma.junkemailfilter.com

Examples using Exim:

accept	dnslists = hostkarma.junkemailfilter.com=127.0.0.1/$sender_host_name
deny	dnslists = hostkarma.junkemailfilter.com=127.0.0.2/$sender_host_name

Data Life

Blacklist data lives about 3 days, so if you are wrongly blacklisted or if you had a virus and fixed the problem you will automatically be delisted 3 days after the spamming stops. White list data lives about 7 days. The exceptions are those who are permanently white listed or black listed.
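As a rough sketch in Python, the expiry rule might look like the following. The 3 and 7 day lifetimes come from the text; the function name `still_listed`, the idea of a "last seen" timestamp, and the exact comparison are assumptions, not a description of how the real system stores its data.

```python
from datetime import datetime, timedelta

# Approximate lifetimes from the text; the exact expiry rule is an assumption.
LIFETIME = {"black": timedelta(days=3), "white": timedelta(days=7)}


def still_listed(colour, last_seen, now, permanent=False):
    """True if an entry last refreshed at `last_seen` is still listed at `now`."""
    if permanent:
        return True  # permanent listings never age out
    return now - last_seen < LIFETIME[colour]
```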

Overview of the Lists

Unfortunately these lists are not the only solution to spam. But these lists are designed to be a front end to your spam filtering process, allowing you to identify with great accuracy much of your incoming email. These lists have two purposes. One is to catch some spam, but more importantly these lists are used to identify nonspam and to prevent mixed hosts from being blacklisted accidentally by our lists and others. One of the problems with spam filtering is that legitimate senders fail to get their email through because it is miscategorized as spam. These lists help prevent that from happening.

Most spam filtering technology is based on identifying spam, and whatever is left is nonspam. Our method actively identifies nonspam as well as spam. By actively identifying nonspam it eliminates false positives and shrinks the number of messages that you have to work hard to identify with tools like Spam Assassin. These tools are processor intensive and require a lot of rules. They do very well, but sometimes make mistakes.

What Kinds of Spam Does this list Work With?

The black list catches spam-only servers. Generally these include virus infected users who are being used as spam servers. The list is generated from honeypot accounts and from spammer behavior, where spam is caught by doing things that only spammers do. We have developed a lot of unique methods of detecting spam based on the behavior of the spammer. We can detect spammers by the way they try to deliver email rather than by the content of the message.

The real power here is in the white lists. Those who are used to spam filtering need to think differently about spam processing in order to really get the idea. You have to understand that we are not just looking for spam. This list is to catch nonspam. Nonspam is actually easier in some ways because the nonspam servers aren't doing any tricks to hide. They consistently send out good mail. All we do is track that and once the server establishes a clean reputation we bless it.

We also have ways of detecting nonspam that spammers can't duplicate. We use these methods to build our white lists ensuring that good email gets delivered.

How the System Works

Telling all my tricks would be too long. But central to the system is tracking hosts by collecting data by IP address and doing an analysis on the information to determine the karma of the host.

The idea is that multiple trusted servers feed data to a database that tracks IP addresses and counts the number of spams/nonspams sent by these hosts. A spam increments the spam counter. A nonspam increments the nonspam counter. As the counts go up the servers develop a reputation. Those who spew only spam make the blacklist. Those who spew only good email make the white list. And those who spew a mix make the yellow list.
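The counting idea can be sketched in Python as follows. The class name `KarmaCounter` and all three thresholds are hypothetical, picked only to illustrate the idea; the real system's cutoffs are not given here. A small tolerance is used instead of a strict "only spam" or "only nonspam" test so that a handful of mistaken reports does not change a host's colour.

```python
# Hypothetical thresholds, picked only to illustrate the idea.
MIN_MESSAGES = 100       # too little data: no reputation yet
WHITE_MAX_SPAM = 0.001   # allow for a few mistaken spam reports
BLACK_MIN_SPAM = 0.999   # allow for a few mistaken nonspam reports


class KarmaCounter:
    def __init__(self):
        self.counts = {}  # ip -> [spam_count, nonspam_count]

    def record(self, ip, is_spam):
        """Increment the spam or nonspam counter for this IP."""
        entry = self.counts.setdefault(ip, [0, 0])
        entry[0 if is_spam else 1] += 1

    def karma(self, ip):
        """Return 'white', 'black', 'yellow', or None if there is too little data."""
        spam, nonspam = self.counts.get(ip, (0, 0))
        total = spam + nonspam
        if total < MIN_MESSAGES:
            return None
        spam_share = spam / total
        if spam_share <= WHITE_MAX_SPAM:
            return "white"
        if spam_share >= BLACK_MIN_SPAM:
            return "black"
        return "yellow"
```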

Other technology is also used. A honeypot can blacklist a virus infected server instantly, allowing the system to have a very fast response time to new spam servers. The system can also track good servers over a long time, tracking good email and establishing a reputation. Much of the blacklist data comes from using fake low and high MX records. When a host hits only the fake high numbered MX records without hitting the low numbered MX records, the host is a virus infected spam zombie.
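The MX test can be sketched in Python like this. The function name `mx_zombies` and the input format (pairs of IP address and "low" or "high") are assumptions for illustration; how the real system records which MX a connection arrived on is not described here.

```python
def mx_zombies(hits):
    """Return the IPs that contacted only the high numbered MX records.

    `hits` is an iterable of (ip, mx) pairs where mx is "low" or "high",
    naming which kind of MX record the connection arrived on.
    """
    seen = {}
    for ip, mx in hits:
        seen.setdefault(ip, set()).add(mx)
    # A host that tried a low numbered MX at least once is not flagged.
    return {ip for ip, kinds in seen.items() if kinds == {"high"}}
```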

White and yellow listing are also done using a table of domain names that are known to send only good email or are known to send mixed email (yahoo, hotmail). The RDNS is looked up, the host name is verified to see that it resolves back to the same IP address, and if the name ends in a domain that is on our list then we add the IP address to our white or yellow lists.
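A sketch of this check in Python, under stated assumptions: the table entries below are placeholders (the real table is not published here), and the function name `listing_for` and its injectable `reverse` and `forward` resolvers are made up for illustration. By default it uses `socket.gethostbyaddr` and `socket.gethostbyname_ex` from the standard library.

```python
import socket

# Placeholder entries; the real table of domains is not published here.
DOMAIN_TABLE = {"bank.example": "white", "isp.example": "yellow"}


def listing_for(ip, reverse=None, forward=None, table=DOMAIN_TABLE):
    """Return 'white', 'yellow', or None for an IP using verified reverse DNS."""
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda name: socket.gethostbyname_ex(name)[2])
    try:
        name = reverse(ip).rstrip(".").lower()
        if ip not in forward(name):
            return None  # the name does not resolve back to the same IP
    except OSError:
        return None      # a lookup failed, so nothing can be verified
    for domain, colour in table.items():
        # Match the domain itself or any host under it, not a mere suffix.
        if name == domain or name.endswith("." + domain):
            return colour
    return None
```

Matching on `"." + domain` matters: a host named `notbank.example` must not inherit the listing of `bank.example`.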

We are always looking to expand our white and yellow lists, so if you send email and your server sends only good email and you want to be on our lists, email me at marc@perkel.com with your host name information.

The Magic is in the White Lists

Think differently. It's not just about blocking spam - it's about accepting good email. The real power in this system is the white and yellow lists, not the black list. Envision this. A bank that sends nothing but good email is communicating with tens of thousands of customers on a regular basis. Its email goes to thousands of servers that host the customers' email. So let's say that 30 of these servers are feeding data to the database. After a few months the IP address of the bank's server has 100,000 good emails recorded and say 20 spams (some people will accidentally report good email as spam). Thus the bank can be whitelisted. Why bother to check email from a host like that for spam?

And it's not just banks. It's all institutions that send only good email. No one has to pay a fee to get listed. It's a karma system. Your good reputation gives you a fast pass through the filter.

Some servers send a mixture of spam and nonspam. Examples are AOL, Yahoo, Hotmail, and Comcast: people who sell email services, or ISPs. They try to get rid of spam, but some people exploit them anyway. These are the servers that make the yellow list. The messages still need to be spam tested, but because they have a reputation of sending some good email they can at least bypass blacklisting. Thus - if a Comcast customer starts spamming through Comcast servers and Comcast doesn't detect it, this system will at least keep the Comcast server from being blacklisted, which keeps other Comcast customers from having their email blocked.

Problems this Service Solves

One famous controversy over spam filtering is the battle between AOL/Goodmail and the Electronic Frontier Foundation. In this case both sides are wrong, with EFF being a little more wrong than AOL. The Goodmail/AOL relationship is based on the idea that Goodmail certifies email as good and AOL accepts it as good email. But there's $$$ involved and because of this EFF has accused AOL of trying to turn email into a paid service. Unfortunately EFF can't get beyond listening to themselves echo their own opinion to understand that the concepts behind AOL/Goodmail are at least partially sound. The idea is to get the good email through.

This system eliminates the need for AOL/Goodmail's system in that it automatically tracks good email from all servers and makes their karma available to the world. So rather than having to pay to get a reputation as a trusted server all you have to do is consistently send good email and when the world sees that then you get whitelisted. Problem solved.

Can Spammers Out Smart This System?

The short answer is yes - probably some can. However it represents yet another significant hurdle for them to cross. In reality this system will block mostly easy to detect spam sources. But - that's not where the power lies. It doesn't matter if spammers outsmart this system. What this system does is protect good email from being falsely identified as spam and blocked. This isn't a spam filter as much as a ham filter. The power is in identifying good email.

To block spam you would just use this as a front end to your system to preclassify the easy spam/ham and then pass the rest on to meaner tests. A spammer might be able to fake their way from being blacklisted to yellowlisted. But not all the way to whitelisted.

The Future of the Concept - The Big Picture

This system can make a huge difference in the accuracy of spam detection for the entire planet. Every email server on the planet - if it were scaled up - could access these lists and eliminate some 50% of all spam and identify some 95% of all nonspam with 100% accuracy with extremely little effort. To do it right would take several major partners getting involved and better programmers than me.

Here's what it would take:

You would have a central (replicated) MySQL cluster that is big and hardened and secret and immune from DOS attacks. This is where the data for the lists is kept. If done right it might run on less than $10,000 worth of hardware.

As a front end to this are a number of MyDNS servers and caching front end servers that connect to the databases on the back end and provide a front end for email servers all over the world to access. It would also take some smart people and many servers running Spam Assassin to check the quality of the lists and verify that the lists remain accurate. And it might take a few people to watch over it to make sure there aren't any problems, and some programmers to adapt to spammers who will always try to beat the system.

This isn't going to solve the spam problem. But if done right it will significantly reduce the false positive problem allowing for far greater front end accuracy. This will greatly reduce system load and make the remaining email easier to process.

Spam DNS Lists