From Computer Tyme Support Wiki

Revision as of 19:11, 24 February 2009 by Marc (Talk | contribs)

Were you Blacklisted and want to be removed?

Have you been blacklisted on our Hostkarma list? To check or be removed Click Here.

Creating White/Yellow/Black DNS lists for email systems in the fight against spam.

These are free DNS host karma listing servers that tell the world which servers are sending spam, nonspam, or a mix of spam and nonspam. This is a service of Junk Email Filter dot com, and one of many technologies used in advanced email filtering.

LICENSE - Using these Lists is Free

This list is a copyrighted list. Unless you really load our servers and suck a lot of bandwidth, use of these lists is almost free.

If you are a non-profit organization usage of this list is free. In fact, if you are a progressive nonprofit you might qualify for free spam filtering service as our way of helping to support progressive causes. (We determine what we consider progressive)

If you are not a commercial site you can use it for free.

If you are a small business you can use it for free. However, in return for using it for free, we ask as a favor that you thank us somewhere on your web site (not a requirement). Link to http://www.junkemailfilter.com.

If you have or serve over 1000 email accounts or make more than 250,000 DNS queries per day then contact us for pricing and licenses. Rsync copies are available in rbldnsd format. Contact support@junkemailfilter.com for access and pricing.

List Attitude

Different lists have different criteria for listing, which to a large extent reflect the personality of the people behind the list. Some lists are angry lists: they list everything, and if you got on their list it's your fault.

This list is not an angry list.

Our position is that if you are a spammer we want to block you. If you are not a spammer we want to make sure your email gets delivered. And if you have been hacked or have a virus we want to help you get back to normal and get you off our blacklist as quickly as we can. And if you never send spam we want you on our whitelist. To us it's all about delivering good email and blocking bad email. Our mission is to get it right and to be professional and friendly about it.

How to use the Lists

Junk Email Filter dot com provides several public lists. One is a black list to block spam, and another is a white list used either to pass nonspam/ham or to keep sites from being blocked. Blocking is done by IP address, which is something spammers can't spoof. We look at email hosts as being one of these kinds:

hosts that generate only spam that we blacklist
hosts that generate a mix which we yellow list
hosts that generate only nonspam which we whitelist

Our list server is hostkarma.junkemailfilter.com - this server returns several different results depending on what kind of listing it is. If the server returns 127.0.0.1 then it is whitelisted. You can accept the email without any further checking.

If the result is 127.0.0.3 then the host is yellow listed. Yellow listing means that host generates some spam and some nonspam (examples: yahoo.com, hotmail.com). What that means is that this host should never be blacklisted and that other IP based blacklists should be bypassed to prevent false positives.

If the result is 127.0.0.2 it is blacklisted - if the IP is listed here you can bounce it without further checking.

And if the result is 127.0.0.4 it is brownlisted which means it is on its way to being blacklisted but hasn't quite got there yet. But it might be worth a few points using SpamAssassin.

127.0.0.1 - whitelist - trusted nonspam
127.0.0.2 - blacklist - block spam
127.0.0.3 - yellowlist - mix of spam and nonspam
127.0.0.4 - brownlist - all spam - but not yet enough to blacklist

Like all IP based lists, the octets of the client's IP address are reversed in order and the list name is appended. So if you were to look up 1.2.3.4 you would query the DNS for the following hostname:

4.3.2.1.hostkarma.junkemailfilter.com
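The reversal described above can be sketched in a few lines of Python. This is only an illustration: the function name ip_query_name is made up for this example, and it builds the query name without performing the DNS lookup itself.

```python
import ipaddress

ZONE = "hostkarma.junkemailfilter.com"


def ip_query_name(ip, zone=ZONE):
    """Build the DNS name to query for an IPv4 address (hypothetical helper)."""
    # Validate first; ipaddress raises ValueError for malformed input.
    address = ipaddress.IPv4Address(ip)
    octets = str(address).split(".")
    # Reverse the octets and append the list name.
    return ".".join(reversed(octets)) + "." + zone


print(ip_query_name("1.2.3.4"))  # prints 4.3.2.1.hostkarma.junkemailfilter.com
```

The resulting name would then be resolved as an ordinary A record lookup, for instance with socket.gethostbyname_ex from the Python standard library. That step is left out so the sketch runs without network access.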

Name Based Lookups

In addition to IP based lookups the hostkarma list also supports name based lookups. If you wanted to look up wellsfargo.com, you would query the DNS for the following hostname:

wellsfargo.com.hostkarma.junkemailfilter.com

As with the IP lists a 127.0.0.1 is a white listing, 127.0.0.2 is a black listing.

List Logic

The best way to use the lists is to do it in a specific order:

First you check the white list and see if the host is white. If so, you accept the message without further processing. Then you see if the host is yellow listed. If so, you skip all your blacklist tests. Then you check your blacklists and if the host is listed you bounce it. Whatever email is left is then tested with all your other testing methods like Spam Assassin.
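As a rough illustration of that ordering, here is a small Python sketch. The names decide and Action are invented for this example, and it assumes the hostkarma result codes have already been looked up and that the result of your other blacklist checks is passed in as a simple flag.

```python
from enum import Enum

WHITE = "127.0.0.1"
BLACK = "127.0.0.2"
YELLOW = "127.0.0.3"


class Action(Enum):
    ACCEPT = "accept"                  # deliver without further checks
    REJECT = "reject"                  # bounce the message
    CONTENT_FILTER = "content_filter"  # hand over to SpamAssassin and friends


def decide(hostkarma_codes, other_blacklist_hit):
    """Apply the white -> yellow -> black -> everything else order."""
    codes = set(hostkarma_codes)
    if WHITE in codes:
        return Action.ACCEPT
    if YELLOW in codes:
        # Yellow hosts skip every blacklist test but are still content filtered.
        return Action.CONTENT_FILTER
    if BLACK in codes or other_blacklist_hit:
        return Action.REJECT
    return Action.CONTENT_FILTER


print(decide({"127.0.0.3"}, other_blacklist_hit=True))  # prints Action.CONTENT_FILTER
```

The point of the order is that a yellow listing wins over a hit on any blacklist, which is what prevents false positives on mixed hosts.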

Exim Examples

Exim is an extremely powerful MTA, probably the most powerful MTA on the planet. That's why I like it so much. I want to do what I want to do and Exim allows me to do it.

# Mark it White 
warn dnslists = hostkarma.junkemailfilter.com=127.0.0.1
     set acl_c_white = white - dnswl - $sender_fullhost

# Mark it Yellow 
warn dnslists = hostkarma.junkemailfilter.com=127.0.0.3
     set acl_c_yellow = yellow - $sender_fullhost

# Using the Black List
deny dnslists = hostkarma.junkemailfilter.com=127.0.0.2

# Other Blacklists
deny !dnslists = hostkarma.junkemailfilter.com=127.0.0.1,127.0.0.3
     dnslists = zen.spamhaus.org/<;$sender_host_address;$sender_address_domain :\
     nomail.rhsbl.sorbs.net/$sender_address_domain : cbl.abuseat.org :\
     list.dsbl.org : web.dnsbl.sorbs.net : socks.dnsbl.sorbs.net :\
     http.dnsbl.sorbs.net

Postfix Examples

Postfix For Blacklisting:

reject_rbl_client hostkarma.junkemailfilter.com=127.0.0.2

Postfix doesn't support whitelisting natively. You'd have to use a policy-daemon instead of reject_rbl_client.

Spam Assassin Examples

Spam Assassin can access the white and black lists for scoring.

header __RCVD_IN_JMF eval:check_rbl('JMF-lastexternal','hostkarma.junkemailfilter.com.')
describe __RCVD_IN_JMF Sender listed in JunkEmailFilter
tflags __RCVD_IN_JMF net
 
header RCVD_IN_JMF_W eval:check_rbl_sub('JMF-lastexternal', '127.0.0.1')
describe RCVD_IN_JMF_W Sender listed in JMF-WHITE
tflags RCVD_IN_JMF_W net nice
score RCVD_IN_JMF_W -5
 
header RCVD_IN_JMF_BL eval:check_rbl_sub('JMF-lastexternal', '127.0.0.2')
describe RCVD_IN_JMF_BL Sender listed in JMF-BLACK
tflags RCVD_IN_JMF_BL net
score RCVD_IN_JMF_BL 3.0
 
header RCVD_IN_JMF_BR eval:check_rbl_sub('JMF-lastexternal', '127.0.0.4')
describe RCVD_IN_JMF_BR Sender listed in JMF-BROWN
tflags RCVD_IN_JMF_BR net
score RCVD_IN_JMF_BR 1.0

Name Based DNS Lookup

The hostkarma DNS list supports name based lookups as well as IP based lookups.

<hostname>.hostkarma.junkemailfilter.com

127.0.0.1 = whitelisted
127.0.0.2 = blacklisted
127.0.0.3 = yellowlisted

Example: dig hermes.apache.org.hostkarma.junkemailfilter.com

Examples using Exim:

accept	dnslists = hostkarma.junkemailfilter.com=127.0.0.1/$sender_host_name
deny	dnslists = hostkarma.junkemailfilter.com=127.0.0.2/$sender_host_name

Examples using Postfix:

reject_rhsbl_sender hostkarma.junkemailfilter.com=127.0.0.2

No Blacklist List

We have also created a No blacklist list of IP and host names that are either white listed, yellow listed, or otherwise determined that these IP addresses should never be in any blacklist.

The purpose of the list is to avoid false positives. If you are running any kind of DNS list check you can read this list first and if it is listed then you need not test any other blacklists because they will be wrong.

127.0.0.1 = whitelisted - accept as good
127.0.0.3 = yellowlisted - mixed source - do not blacklist or whitelist
127.0.0.5 = nobl listed - not a spam source - do not blacklist - maybe whitelist

Any result from this list means do not blacklist. The list is accessed as follows:

accept	dnslists = nobl.junkemailfilter.com
....
blacklist tests

Both name and IP queries to this list are accepted:

4.3.2.1.nobl.junkemailfilter.com
mydomain.com.nobl.junkemailfilter.com

Experimental Return Codes

Our lists use a different philosophy than most lists. Instead of making separate calls over and over to separate lists we combine all our information into a single call. The theory is that it is far more efficient to return all the information in a single call, reducing bandwidth and increasing speed through a reduced number of calls.

The following are experimental codes that we are using internally. These may not be a list of all the return codes we use and we don't guarantee that we will continue to use these codes. But if we list these codes here it's because we have been using them for a while and finding them somewhat useful. If you want to use this information we would appreciate feedback on anything you find that might be interesting. There are 4 billion possible return codes so I don't think we are ever going to run out. Because we provide a lot of information, any software that accesses our lists needs to be prepared to receive and parse multiple return codes. Here's an example of what you might see on a whitelisted domain:

dig wellsfargo.com.hostkarma.junkemailfilter.com

;; ANSWER SECTION:
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.0.1
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.1.1
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.2.3

The results indicate that the domain wellsfargo.com is whitelisted (127.0.0.1), uses QUIT (127.0.1.1), and is familiar to us for over a week (127.0.2.3).

Tracking use of QUIT

Usually virus infected spam bots don't close the connection using the QUIT command. That's because the message is already sent and the spam bot isn't going to hang around and be polite and close the connection. This by itself is not sufficient to indicate a spam bot, but it is a very important piece that, when combined with other behaviors, makes spam bot detection both easy and accurate. We track both the host name and IP addresses so you can use hostkarma to look up either one. The codes are as follows:

127.0.1.1 - QUIT is used
127.0.1.2 - No QUIT is used
127.0.1.3 - Mixed - Quit is used sometimes

We do have some mutual exclusion logic and do some counting and other refinements to improve the data. As this is experimental we are not ready to document further details. As with our lists data can be tested as follows:

4.3.2.1.hostkarma.junkemailfilter.com
example.com.hostkarma.junkemailfilter.com

Familiar Domains

Spammers often register new domain names and use them for spam. Most commonly they are used as links to sites that the spam wants you to click on. Many of these sites are fraud sites pretending to be banks so that they can get your account information and steal your money. But there is no easy way to get a list of new domains. Several people have tried, but by the time they process the data the domains have been in operation for some time.

So instead of listing new domains what we are trying to do is list old domains in what we call our familiar list. The idea is that if the domain isn't listed then it is unfamiliar, and thus new domains can be detected instantly upon being used. Of course this detects domains that are familiar to us, so if an old domain contacts our servers for the first time it is also unfamiliar. So although we can detect 100% of new domains, not all domains detected as new are actually new. They are just new to us.

So keeping this in mind being unfamiliar isn't anything you would want to use for blocking but rather as one piece of information that when combined with other sins indicates that the unfamiliar domain is being used for fraud. We also track how long the domain is known to us so that creates an age indication that might be useful.

127.0.2.1 - domains we first saw today
127.0.2.2 - domains we first saw in the last 7 days
127.0.2.3 - domains that are older than 7 days

And, of course, if not listed then the domain is unfamiliar to us. Domains are looked up by querying the hostkarma list as follows:

example.com.hostkarma.junkemailfilter.com
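Since a single query can return several codes at once, client software has to sort them into groups. The following Python sketch shows one way that might be done. The function name decode and the result labels are invented for this example, and because the codes are experimental it simply ignores anything it does not recognise.

```python
# Third octet selects the group, fourth octet selects the value.
LISTING = {1: "white", 2: "black", 3: "yellow", 4: "brown"}
QUIT = {1: "QUIT is used", 2: "no QUIT is used", 3: "mixed"}
FAMILIAR = {1: "first seen today", 2: "first seen in the last 7 days",
            3: "older than 7 days"}
GROUPS = {0: ("listing", LISTING), 1: ("quit", QUIT), 2: ("familiarity", FAMILIAR)}


def decode(answers):
    """Turn a list of A record strings into a small summary dictionary."""
    # No familiarity code in the answer means the domain is unfamiliar.
    result = {"listing": None, "quit": None, "familiarity": "unfamiliar"}
    for answer in answers:
        parts = answer.split(".")
        if len(parts) != 4 or not all(p.isdigit() for p in parts):
            continue
        if parts[0] != "127" or parts[1] != "0":
            continue
        group = GROUPS.get(int(parts[2]))
        if group is None:
            continue  # unknown experimental group, skip it
        key, table = group
        value = int(parts[3])
        if value in table:
            result[key] = table[value]
    return result


# The wellsfargo.com answer shown earlier on this page:
print(decode(["127.0.0.1", "127.0.1.1", "127.0.2.3"]))
```

For the wellsfargo.com answer this reports a white listing, QUIT in use, and a domain known for more than 7 days. An empty answer leaves the listing and QUIT fields unset and reports the name as unfamiliar.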

Data Life

Blacklist data lives about 5 days, so if you are wrongly blacklisted or if you had a virus and fixed the problem you will automatically be delisted 5 days after the spamming stops. White list data lives about 10 days. The exceptions are those who are permanently white listed or black listed.

Blacklist Compared

How do our lists compare to other lists? Here are some web sites where lists are compared.

[Jeff Makey's List]
[Spam Cannibal]
[DNSBL Resources]

You can help us help you by building our list

If you want to participate in helping to build our lists and further reduce your spam you can participate in our project tarbaby.

Overview of the Lists

Unfortunately these lists are not the only solution to spam. But these lists are designed to be a front end to your spam filtering process, allowing you to identify with great accuracy much of your incoming email. These lists have two purposes. One is to catch some spam, but more importantly these lists are used mostly to identify nonspam and to prevent mixed hosts from being blacklisted accidentally by our lists and others. One of the problems with spam filtering is that legitimate senders fail to get their email through because it is miscategorized as spam. These lists help prevent that from happening.

Most spam filtering technology is based on identifying spam, and whatever is left is nonspam. Our method also actively identifies nonspam as well as spam. By actively identifying nonspam it eliminates false positives and shrinks the number of messages that you have to work hard to identify with tools like SpamAssassin. These tools are processor intensive and require a lot of rules that do very well, but sometimes make mistakes.

What Kinds of Spam Does this list Work With?

The black list catches spam-only servers. Generally these include virus infected users who are being used as spam servers. The list is generated from honeypot accounts and from spammers' behavior, where spam is caught by doing things that only spammers do. We have developed a lot of unique methods of detecting spam based on the behavior of the spammer. We can detect spammers by the way they try to deliver email rather than by the content of the message.

The real power here is in the white lists. Those who are used to spam filtering need to think differently about spam processing in order to really get the idea. You have to understand that we are not just looking for spam. This list is to catch nonspam. Nonspam is actually easier in some ways because the nonspam servers aren't doing any tricks to hide. They consistently send out good mail. All we do is track that and once the server establishes a clean reputation we bless it.

We also have ways of detecting nonspam that spammers can't duplicate. We use these methods to build our white lists ensuring that good email gets delivered.

How the System Works

Telling all my tricks would be too long. But central to the system is tracking hosts by collecting data by IP address and doing an analysis on the information to determine the karma of the host.

The idea is that multiple trusted servers feed data to a database that tracks IP addresses and counts the number of spams/nonspams sent by these hosts. A spam increments the spam counter. A nonspam increments the nonspam counter. As the counts go up the servers develop a reputation. Those who spew only spam make the blacklist. Those who spew only good email make the white list. And those who spew a mix make the yellow list.
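The counting idea can be illustrated with a short Python sketch. This is not the actual hostkarma code: the class name KarmaTable, the minimum message count, and the tolerance for stray reports are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Counts:
    spam: int = 0
    ham: int = 0


class KarmaTable:
    """Per-IP spam/nonspam counters with a simple classification rule."""

    def __init__(self, min_messages=100, tolerance=0.001):
        # Both thresholds are illustrative guesses, not the real values.
        self.min_messages = min_messages
        self.tolerance = tolerance
        self._counts = defaultdict(Counts)

    def record(self, ip, is_spam, count=1):
        entry = self._counts[ip]
        if is_spam:
            entry.spam += count
        else:
            entry.ham += count

    def classify(self, ip):
        entry = self._counts.get(ip)
        if entry is None or entry.spam + entry.ham < self.min_messages:
            return "unknown"  # not enough data for a reputation yet
        spam_ratio = entry.spam / (entry.spam + entry.ham)
        if spam_ratio <= self.tolerance:
            return "white"
        if spam_ratio >= 1 - self.tolerance:
            return "black"
        return "yellow"


table = KarmaTable()
table.record("192.0.2.1", is_spam=False, count=100000)
table.record("192.0.2.1", is_spam=True, count=20)
print(table.classify("192.0.2.1"))  # prints white
```

The small tolerance is there because a few people will always report good mail as spam by mistake. Without it, no busy host could ever stay on the white list.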

Other technology is also used. Honeypots can blacklist a virus infected server instantly, giving the system a very fast response time to new spam servers. The system can also track good servers over a long time, tracking good email and establishing a reputation. Much of the blacklist data comes from using fake low and high MX records. When a host hits only the fake high numbered MX records without hitting the low numbered MX records the host is a virus infected spam zombie.
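The fake MX test might be sketched in Python like this. The class name MxTrap and the three tier labels are hypothetical, and the sketch ignores details such as time windows that a real system would need.

```python
from collections import defaultdict

# Assumed layout: a fake lowest-numbered MX, the real MX, a fake highest MX.
TIERS = {"low_fake", "real", "high_fake"}


class MxTrap:
    """Remember which MX tiers each connecting IP has contacted."""

    def __init__(self):
        self._seen = defaultdict(set)

    def record_connection(self, ip, tier):
        if tier not in TIERS:
            raise ValueError("unknown MX tier: " + tier)
        self._seen[ip].add(tier)

    def looks_like_zombie(self, ip):
        tiers = self._seen.get(ip, set())
        # A well behaved sender starts at the lowest numbered MX. Going
        # straight to the fake high MX is what spam zombies tend to do.
        return "high_fake" in tiers and "low_fake" not in tiers


trap = MxTrap()
trap.record_connection("192.0.2.66", "high_fake")
print(trap.looks_like_zombie("192.0.2.66"))  # prints True
```

A host that first tried the low numbered record and then worked its way up is not flagged, which is the behavior a normal mail server shows when retrying.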

White and yellow listing are also done using a table of domain names that are known to send only good email or are known to send mixed email (yahoo, hotmail). The RDNS is looked up, the host name is verified to see that it resolves back to the same IP address, and if the name ends in a domain that is on our list then we add the IP address to our white or yellow lists.
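Here is a rough Python sketch of that check. The function name colour_from_rdns and the example domain table are made up, and the two lookup functions are passed in as parameters so the sketch can run without touching the network.

```python
def colour_from_rdns(ip, reverse_lookup, forward_lookup, domain_table):
    """Return "white" or "yellow" for an IP, or None if it does not qualify.

    reverse_lookup(ip) returns a host name or None; forward_lookup(name)
    returns a list of IP address strings. Both are supplied by the caller.
    """
    name = reverse_lookup(ip)
    if not name:
        return None
    name = name.rstrip(".").lower()
    # Verify the name: it has to resolve back to the address we started with.
    if ip not in forward_lookup(name):
        return None
    for domain, colour in domain_table.items():
        # Match the domain itself or any host underneath it.
        if name == domain or name.endswith("." + domain):
            return colour
    return None


# Hypothetical data standing in for real DNS answers and the real table.
ptr_records = {"192.0.2.10": "mail.bank.example."}
a_records = {"mail.bank.example": ["192.0.2.10"]}
domains = {"bank.example": "white", "freemail.example": "yellow"}

print(colour_from_rdns("192.0.2.10", ptr_records.get,
                       lambda name: a_records.get(name, []), domains))  # prints white
```

In a live system the two lookups might be thin wrappers around socket.gethostbyaddr and socket.gethostbyname_ex. The forward check matters because anyone who controls their own reverse DNS can claim any host name they like.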

We are always looking to expand our white and yellow lists, so if you send email and your server sends only good email and you want to be on our lists, email me at marc@perkel.com with your host name information.

The Magic is in the White Lists

Think differently. It's not just about blocking spam - it's about accepting good email. The real power in this system is the white and yellow lists, not the black list. Envision this. A bank that sends nothing but good email is communicating with tens of thousands of customers on a regular basis. Their email goes to thousands of servers that host the customers' email. So let's say that 30 of these servers are feeding data to the database. After a few months the IP address of the bank's server has 100,000 good emails recorded and say 20 spams (some people will accidentally report good email as spam). Thus the bank can be whitelisted. Why bother to check email from a host like that for spam?

And it's not just banks. It's all institutions that send only good email. No one has to pay a fee to get listed. It's a karma system. Your good reputation gives you a fast pass through the filter.

Some servers send a mixture of spam and nonspam. Examples are AOL, Yahoo, Hotmail, Comcast: people who sell email services, or ISPs. They try to get rid of spam, but some people exploit them anyway. These are servers that make the yellow list. The messages still need to be spam tested, but because they have a reputation of sending some good email they can at least bypass blacklisting. Thus - if a Comcast customer starts spamming through Comcast servers and Comcast doesn't detect it, this system will at least keep the Comcast server from being blacklisted, which would otherwise cause other Comcast customers to have their email blocked.

Problems this Service Solves

One famous controversy over spam filtering is the battle between AOL/Goodmail and the Electronic Frontier Foundation. In this case both sides are wrong, with EFF being a little more wrong than AOL. The Goodmail/AOL relationship is based on the idea that Goodmail certifies email as good and AOL accepts it as good email. But there's $$$ involved and because of this EFF has accused AOL of trying to turn email into a paid service. Unfortunately EFF can't get beyond listening to themselves echo their own opinion to understand that the concepts behind AOL/Goodmail are at least partially sound. The idea is to get the good email through.

This system eliminates the need for AOL/Goodmail's system in that it automatically tracks good email from all servers and makes their karma available to the world. So rather than having to pay to get a reputation as a trusted server all you have to do is consistently send good email and when the world sees that then you get whitelisted. Problem solved.

Can Spammers Out Smart This System?

The short answer is yes - probably some can. However it represents yet another significant hurdle for them to cross. In reality this system will block mostly easy to detect spam sources. But - that's not where the power lies. It doesn't matter if spammers outsmart this system. What this system does is protect good email from being falsely identified as spam and blocked. This isn't a spam filter as much as a ham filter. The power is in identifying good email.

To block spam you would just use this as a frontend to your system to preclassify the easy spam/ham and then pass the rest on to meaner tests. A spammer might be able to fake their way from being blacklisted to yellowlisted. But not all the way to whitelisted.

Spam DNS Lists