Spam DNS Lists

From Computer Tyme Support Wiki

(Difference between revisions)
m
(Blacklist Compared)
(132 intermediate revisions not shown)
Line 1: Line 1:
-
<h1>Creating White/Yellow/Black DNS lists for email systems in the fight against spam.</h1>
+
= Were you Blacklisted and want to be removed? =
 +
 
 +
Have you been blacklisted on our Hostkarma list? <font color=red><b>To check or be removed [http://ipadmin.junkemailfilter.com/remove.php Click Here].</b></font>
 +
 
 +
= Creating White/Yellow/Black DNS lists for email systems in the fight against spam. =
Free DNS host karma listing servers to provide information to the world about what servers are sending spam, nonspam, or a mix of spam and nonspam. This is a service of [http://www.junkemailfilter.com Junk Email Filter dot com]. One of many technologies used in advanced email filtering.<br>
-
== Using these Lists is Almost Free ==
+
== LICENSE - Using these Lists is Free ==
-
Unless you put a heavy load on our servers or use a lot of bandwidth, use of these lists is almost free. The price of using this list is that you have to post a link on your web site thanking us for the use of the list and linking to [http://www.junkemailfilter.com http://www.junkemailfilter.com]. <font color=red><b>Your link and your thank you is your license fee</b></font>.
+
Unless you put a heavy load on our servers or use a lot of bandwidth, use of these lists is almost free.
 +
 
 +
* If you are a non-profit organization usage of this list is free. In fact, if you are a progressive nonprofit you might qualify for free spam filtering service as our way of helping to support progressive causes. (We determine what we consider progressive)
 +
 
 +
* If you are a small business you can use it for free. However we ask as a favor for using it for free that you thank us somewhere on your web site. (Not a requirement) Link to [http://www.junkemailfilter.com http://www.junkemailfilter.com].
 +
 
 +
* Rsync copies are available in rbldnsd format. Contact [mailto:support@junkemailfilter.com support@junkemailfilter.com] for access and pricing.
 +
 
 +
== List Attitude ==
 +
 
 +
Different lists have different criteria for listing, which to a large extent reflect the personality of the people behind the list. Some lists are angry lists where they list everything and if you got on their list it's your fault. There are also lists that have nothing to do with spam, but try to punish behavior that they don't like, or try to promote technologies that do not work.
 +
 
 +
<font color=green><b>This list is not an angry list. We focus on the reality of what really works.</b></font>
 +
 
 +
Our position is that if you are a spammer we want to block you. If you are not a spammer we want to make sure your email gets delivered. And if you have been hacked or have a virus we want to help you get back to normal and get you off our blacklist as quickly as we can. If your server is misconfigured, we want to help you get it right so that your good email can be delivered as efficiently as possible. And if you never send spam we want you to be on our whitelist. To us it's all about delivering good email and blocking bad email. Our mission is to get it right and to be professional and friendly about it. And because there is so much spam out there, we want to partner with our competitors so that we can all keep our customers happy and the spammers unhappy.
 +
 
 +
Our system also sends out automated notices to alert spam sources of problems and to get feedback in case we are rejecting good email, so that we can fix problems that we don't know about. This helps ISPs and office network admins find and shut down virus infected computers, reducing spam across the planet. Our view is that the best way to fight spam is to stop it at the source.
 +
 
 +
[http://www.junkemailfilter.com Junk Email Filter] uses innovative techniques to fight spam. We have been the leader in introducing several new spam fighting technologies. This list is an example of our commitment not just to be accurate but to be efficient. Most lists are just black lists. A few have multiple return codes as to why the IP is blacklisted. There are also a few white lists, but most of those white lists are really lists of servers not to blacklist. Our lists go much farther.
 +
 
 +
# Besides just black lists and white lists we have yellow lists and NOBL lists. (Our Invention) White on our lists means that anything that comes from the source is good email and needs no further testing. NOBL is like most others' white lists: it means this IP or host name should not be black listed, so there is no need to check the black lists. Yellow listing indicates a mixed source of good email and spam. Sources like Hotmail, Yahoo, and Gmail are yellow sources. Yellow means that the IP address or host name contains no information about whether it is good or bad, so there is no reason to check white or black lists.
 +
# Instead of having a lot of separate lists which would require multiple DNS lookups we support a single DNS look up and we return different codes as to the status of the IP. In some cases we return multiple codes indicating the IP meets multiple conditions. At this time we are the only DNS list that does this. However we can not ignore the efficiency of a single call lookup and we think that this is a model for the future.
 +
# Forward Confirmed reverse DNS (FCrDNS) lookup is almost impossible to spoof. We therefore think there is an opportunity to provide host name lookup based on FCrDNS not just for blacklisted names, but for all the other colors too. As with the IP lookups, we return multiple results in a single DNS call that indicates everything we have that is useful.
 +
# Unlike most lists and spam filtering systems who focus on black lists, we focus on white lists as well as NOBL and Yellow lists to actively detect good email and protect good email from being misclassified. It's not just a matter of catching spam and letting everything else pass. We actively detect good email and pass it through as quickly and efficiently as possible. This allows us to pass good email faster by avoiding unnecessary spam tests on email we can easily determine is good. This is a philosophy we have worked hard to instill in the spam fighting community.
 +
 
 +
=== Retaliatory Listings ===
 +
 
 +
We do everything possible to make sure that legitimate email is not blocked by our lists and we expect those who run lists to do the same. If any other list knowingly blacklists our IP space or fails to remove blacklists against us after being informed of their error we reserve the right to blacklist your IP space.
 +
 
 +
=== Types of Listings ===
 +
 
 +
Our listings are a little different than some RBLs. Instead of having you make several DNS calls to test for white or black listing we return it in one call and we have different result codes based on the reputation of the host name or IP being looked up. We also provide other general information that isn't black or white but might be useful in determining if something is good or not in combination with other tests. Here are some features of our lists.
 +
 
 +
* Black Lists - As with other lists we black list IPs, and host names!
 +
* White Listings - We list good IPs as well as bad
 +
* Yellow Listing - For Google, Outlook, Hotmail, Yahoo - IP contains no useful information.
 +
* Quit / NotQuit - Do they close the connection Properly?
 +
* Name Based Lookups - Not just IPs but host names too!
 +
* Domain Age - If the host name is familiar or not. Can be used to catch spammers using newly registered domains
 +
* TLDs - test for legit Top Level Domains
 +
* Registry Barriers - Test to see where the main domain stops and the subdomain starts.
 +
* Country Codes - Look up what country the IP came from.
== How to use the Lists ==
-
[http://www.junkemailfilter.com Junk Email Filter dot com] provides 2 public lists: one is a black list to block spam and the other is a white list to either pass nonspam or to keep sites from being blocked. Blocking is done by IP address which is something spammers can't spoof. We look at email hosts as being one of three kinds: hosts that generate only spam, which we blacklist; hosts that generate only nonspam, which we whitelist; and hosts that generate a mix, which we yellow list.
+
[http://www.junkemailfilter.com Junk Email Filter dot com] provides several public lists -- one is a black list to block spam and the other is a white list to either pass nonspam/ham or to keep sites from being blocked. Blocking is done by IP address which is something spammers can't spoof. We look at email hosts as being one of these kinds:
-
Our blacklist server is hostkarma.junkemailfilter.com with result is 127.0.0.2 - if the IP is listed here you can bounce it without further checking.
+
* hosts that generate only spam that we blacklist
 +
* hosts that generate a mix which we yellow list.
 +
* hosts that generate only nonspam which we whitelist
-
Our whitelist server is hostkarma.junkemailfilter.com - this server returns two different results. If the server returns 127.0.0.1 then it is whitelisted. You can accept the email without any further checking. If the result is 127.0.0.3 then the host is yellow listed. Yellow listing means that host generates some nonspam. What that means is that this host should never be blacklisted and that other IP based blacklists should be bypassed to prevent false positives.
+
Our list server is <font color=red>hostkarma.junkemailfilter.com</font> - this server returns several different results depending on what kind of listing it is. If the server returns 127.0.0.1 then it is whitelisted. You can accept the email without any further checking.  
 +
 
 +
If the result is 127.0.0.3 then the host is yellow listed. Yellow listing means that host generates some spam and some nonspam (examples: yahoo.com, hotmail.com). What that means is that this host should never be blacklisted and that other IP based blacklists should be bypassed to prevent false positives.
 +
 
 +
If the result is 127.0.0.2 it is blacklisted - if the IP is listed here you can bounce it without further checking.
 +
 
 +
And if the result is 127.0.0.4 it is brownlisted which means it is on its way to being blacklisted but hasn't quite got there yet. But it might be worth a few points using SpamAssassin.
* 127.0.0.1 - whitelist - trusted nonspam
Line 19: Line 72:
* 127.0.0.3 - yellowlist - mix of spam and nonspam
* 127.0.0.4 - brownlist - all spam - but not yet enough to blacklist
 +
* 127.0.0.5 - NOBL - This IP is not a spam only source and no blacklists need to be tested
 +
 +
Like all IP based lists, the octets of the client's IP address are reversed in order and the blacklist name is appended. So if you were to look up 1.2.3.4 you would query the DNS for the following hostname:
 +
 +
4.3.2.1.hostkarma.junkemailfilter.com
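
The reversal can be sketched in a few lines of Python. This is only an illustration and not part of the hostkarma service: the function name ip_query_name is made up for this example, and it only builds the name to look up. It does not perform the DNS query.

```python
import ipaddress

# Zone name taken from this page.
HOSTKARMA_ZONE = "hostkarma.junkemailfilter.com"


def ip_query_name(ip, zone=HOSTKARMA_ZONE):
    """Build the DNS name to query for an IPv4 address (hypothetical helper).

    The four octets are reversed and the list name is appended.
    Raises ValueError if `ip` is not a valid IPv4 address.
    """
    addr = ipaddress.IPv4Address(ip)
    octets = str(addr).split(".")
    return ".".join(reversed(octets)) + "." + zone


if __name__ == "__main__":
    print(ip_query_name("1.2.3.4"))
```

Validating the address first keeps malformed input from being sent to the DNS as a query.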
 +
 +
=== Name Based Lookups ===
 +
 +
In addition to IP based lookups the hostkarma list also supports name based lookups. If you wanted to look up wellsfargo.com, you would query the DNS for the following hostname:
 +
 +
wellsfargo.com.hostkarma.junkemailfilter.com
 +
 +
As with the IP lists a 127.0.0.1 is a white listing, 127.0.0.2 is a black listing. The return codes are the same as listed above for IP addresses.
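
A name based query can be sketched the same way. The helper names name_query_name and lookup_codes below are assumptions made for this example. The lookup uses the standard library call socket.gethostbyname_ex, which returns every A record for a name, and it treats any resolver error as "not listed", which is a simplification (a timeout is not the same as an unlisted host).

```python
import socket

# Zone name taken from this page.
HOSTKARMA_ZONE = "hostkarma.junkemailfilter.com"


def name_query_name(hostname, zone=HOSTKARMA_ZONE):
    """Append the list zone to a host name (hypothetical helper)."""
    return hostname.strip().rstrip(".").lower() + "." + zone


def lookup_codes(query_name, resolve=socket.gethostbyname_ex):
    """Return the set of A record values for query_name.

    `resolve` can be replaced so the function is usable without a network.
    Any resolver failure is treated as "not listed" (empty set).
    """
    try:
        _canonical, _aliases, addresses = resolve(query_name)
    except OSError:  # socket.gaierror and socket.herror are subclasses
        return set()
    return set(addresses)


if __name__ == "__main__":
    name = name_query_name("wellsfargo.com")
    print(name, sorted(lookup_codes(name)))
```

Returning a set rather than a single value matters because one lookup can return several codes at once.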
=== List Logic ===
-
The best way to use the lists is to do it in a specific order. First you check the white list and see if it is white. If so you accept the message without further processing. Then you see if the list is yellow. If so - you skip all your blacklist tests. Then you check your blacklists and if listed you bounce it. Whatever email is left is then tested with all your other testing methods like [http://www.spamassassin.org Spam Assassin].
+
The best way to use the lists is to do it in a specific order:
 +
 
 +
First you check the white list and see if it is white. If so you accept the message without further processing. Then you see if the list is yellow. If so - you skip all your blacklist tests. Then you check your blacklists and if listed you bounce it. Whatever email is left is then tested with all your other testing methods like [http://www.spamassassin.org Spam Assassin].
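
The order of tests described above can be sketched as a small Python function. This is an illustration under stated assumptions, not our production logic: the function name classify and the action strings are invented for the example, and NOBL (127.0.0.5) is handled like yellow because the return code list says no blacklists need to be tested for it.

```python
# Return codes as documented on this page.
WHITE = "127.0.0.1"
BLACK = "127.0.0.2"
YELLOW = "127.0.0.3"
NOBL = "127.0.0.5"


def classify(codes):
    """Map the set of returned codes to an action (hypothetical names).

    Order matters: white first, then yellow/NOBL, then black.
    """
    if WHITE in codes:
        return "accept"           # no further processing
    if YELLOW in codes or NOBL in codes:
        return "skip-blacklists"  # still run content filtering
    if BLACK in codes:
        return "reject"
    return "filter"               # hand over to SpamAssassin or similar


if __name__ == "__main__":
    print(classify({"127.0.0.3", "127.0.0.2"}))
```

Checking white and yellow before black is what prevents a mixed or trusted source from being rejected by a blacklist hit.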
=== Exim Examples ===
Line 30: Line 98:
  # Mark it White  
  warn dnslists = hostkarma.junkemailfilter.com=127.0.0.1
-
       set acl_c1 = white - dnswl - $sender_fullhost
+
       set acl_c_white = white - dnswl - $sender_fullhost
   
   
  # Mark it Yellow  
  warn dnslists = hostkarma.junkemailfilter.com=127.0.0.3
-
       set acl_c1 = yellow - $sender_fullhost
+
       set acl_c_yellow = yellow - $sender_fullhost
   
   
  # Using the Black List
Line 45: Line 113:
       list.dsbl.org : web.dnsbl.sorbs.net : socks.dnsbl.sorbs.net :\
       http.dnsbl.sorbs.net
 +
 +
=== Postfix Examples ===
 +
 +
<b>Postfix For Blacklisting:</b>
 +
 +
smtpd_client_restrictions = reject_rbl_client hostkarma.junkemailfilter.com=127.0.0.2
 +
 +
<b>Postfix For Whitelisting and Blacklisting:</b>
 +
 +
smtpd_client_restrictions = permit_dnswl_client hostkarma.junkemailfilter.com=127.0.0.1,
 +
    reject_rbl_client hostkarma.junkemailfilter.com=127.0.0.2
=== Spam Assassin Examples ===
Line 50: Line 129:
[http://www.spamassassin.org Spam Assassin] can access the white and black lists for scoring.
-
  header __RCVD_IN_JMF eval:check_rbl('JMF-lastexternal','hostkarma.junkemailfilter.com.')
+
  header __RCVD_IN_HOSTKARMA eval:check_rbl('HOSTKARMA-lastexternal','hostkarma.junkemailfilter.com.')
-
  describe __RCVD_IN_JMF Sender listed in JunkEmailFilter
+
  describe __RCVD_IN_HOSTKARMA Sender listed in JunkEmailFilter
-
  tflags __RCVD_IN_JMF net
+
  tflags __RCVD_IN_HOSTKARMA net
    
    
-
  header RCVD_IN_JMF_W eval:check_rbl_sub('JMF-lastexternal', '127.0.0.1')
+
  header RCVD_IN_HOSTKARMA_W eval:check_rbl_sub('HOSTKARMA-lastexternal', '127.0.0.1')
-
  describe RCVD_IN_JMF_W Sender listed in JMF-WHITE
+
  describe RCVD_IN_HOSTKARMA_W Sender listed in HOSTKARMA-WHITE
-
  tflags RCVD_IN_JMF_W net nice
+
  tflags RCVD_IN_HOSTKARMA_W net nice
-
  score RCVD_IN_JMF_W -5
+
  score RCVD_IN_HOSTKARMA_W -5
    
    
-
  header RCVD_IN_JMF_BL eval:check_rbl_sub('JMF-lastexternal', '127.0.0.2')
+
  header RCVD_IN_HOSTKARMA_BL eval:check_rbl_sub('HOSTKARMA-lastexternal', '127.0.0.2')
-
  describe RCVD_IN_JMF_BL Sender listed in JMF-BLACK
+
  describe RCVD_IN_HOSTKARMA_BL Sender listed in HOSTKARMA-BLACK
-
  tflags RCVD_IN_JMF_BL net
+
  tflags RCVD_IN_HOSTKARMA_BL net
-
  score RCVD_IN_JMF_BL 3.0
+
  score RCVD_IN_HOSTKARMA_BL 3.0
    
    
-
  header RCVD_IN_JMF_BR eval:check_rbl_sub('JMF-lastexternal', '127.0.0.4')
+
  header RCVD_IN_HOSTKARMA_BR eval:check_rbl_sub('HOSTKARMA-lastexternal', '127.0.0.4')
-
  describe RCVD_IN_JMF_BR Sender listed in JMF-BROWN
+
  describe RCVD_IN_HOSTKARMA_BR Sender listed in HOSTKARMA-BROWN
-
  tflags RCVD_IN_JMF_BR net
+
  tflags RCVD_IN_HOSTKARMA_BR net
-
  score RCVD_IN_JMF_BR 1.0
+
  score RCVD_IN_HOSTKARMA_BR 1.0
-
== Name Based DNS Lookup ==
+
== Implementing Name Based DNS Lookup ==
The hostkarma DNS list supports name based lookups as well as IP based lookups.  
Line 78: Line 157:
* 127.0.0.2 = blacklisted
* 127.0.0.3 = yellowlisted
 +
* 127.0.0.4 = URIBL
 +
* 127.0.0.5 = NOBL listed
Example:
Line 85: Line 166:
  accept dnslists = hostkarma.junkemailfilter.com=127.0.0.1/$sender_host_name
-
  drop dnslists = hostkarma.junkemailfilter.com=127.0.0.2/$sender_host_name
+
  deny dnslists = hostkarma.junkemailfilter.com=127.0.0.2/$sender_host_name
-
== Overview of the Lists ==
+
Examples using Postfix:
-
Unfortunately these lists are not the only solution to spam. But these lists are designed to be a front end to your spam filtering process allowing you to identify with great accuracy much of your incoming email. These lists have two purposes, one is to catch some spam, but more importantly these lists are used mostly to identify nonspam and to prevent mixed hosts from being blacklisted accidentally by our lists and others. One of the problems with spam filtering is that legitimate senders fail to get their email through because it is miscategorized as spam. These lists help prevent that from happening.
+
reject_rhsbl_sender hostkarma.junkemailfilter.com=127.0.0.2
-
Most spam filtering technology is based on identifying spam, and whatever is left is nonspam. Our method actively identifies nonspam as well as spam. By actively identifying nonspam it eliminates false positives and shrinks the number of messages that you have to work hard to identify with tools like Spam Assassin. These tools are processor intensive and require a lot of rules that do very well, but sometimes make mistakes.
+
== No Blacklist List ==
-
== What Kinds of Spam Does this list Work With? ==
+
We have also created a No Blacklist (NOBL) list of IP addresses and host names that are white listed, yellow listed, or otherwise determined to be sources that should never be in any blacklist.
-
The black list catches spam only servers. Generally these include virus infected users who are being used as spam servers. The list is generated by honeypot accounts and by spammer behavior, where spammers are caught doing things that only spammers do. We have developed a lot of unique methods of detecting spam based on the behavior of the spammer. We can detect spammers by the way they try to deliver email rather than by the content of the message.
+
The purpose of the list is to avoid false positives. If you are running any kind of DNS list check you can read this list first and if it is listed then you need not test any other blacklists because they will be wrong.
-
The real power here is in the white lists. Those who are used to spam filtering need to think differently about spam processing in order to really get the idea. You have to understand that we are not just looking for spam. This list is to catch nonspam. Nonspam is actually easier in some ways because the nonspam servers aren't doing any tricks to hide. They consistently send out good mail. All we do is track that and once the server establishes a clean reputation we bless it.
+
* 127.0.0.1 = whitelisted - accept as good
 +
* 127.0.0.3 = yellowlisted - mixed source - do not blacklist or whitelist
 +
* 127.0.0.5 = nobl listed - not a spam source - do not blacklist - maybe whitelist
-
We also have ways of detecting nonspam that spammers can't duplicate. We use these methods to build our white lists ensuring that good email gets delivered.
+
Any result from this list means do not blacklist. The list is accessed as follows:
-
== How the System Works ==
+
accept dnslists = nobl.junkemailfilter.com
 +
....
 +
blacklist tests
-
Telling all my tricks would be too long. But central to the system is tracking hosts by collecting data by IP address and doing an analysis on the information to determine the karma of the host.
+
Both name and IP queries to this list are accepted:
-
The idea is that multiple trusted servers feed data to a database that tracks IP addresses and counts the number of spams/nonspams sent by these hosts. A spam increments the spam counter. A nonspam increments the nonspam counter. As the counts go up the servers develop a reputation. Those who spew only spam make the blacklist. Those who spew only good email make the white list. And those who spew a mix make the yellow list.
+
4.3.2.1.nobl.junkemailfilter.com
 +
mydomain.com.nobl.junkemailfilter.com
-
Other technology is also used. Honeypot can blacklist a virus infected server instantly allowing the system to have a very fast response time to new spam servers. The system can also track good servers over a long time tracking good email and establishing a reputation. Much of the blacklist data comes from using fake low and high MX records. When a host hits only the fake high numbered MX records without hitting the low numbered MX records the host is a virus infected spam zombie.
+
== Country Code List ==
-
White and Yellow listing are also done using a table of domain names that are known to only send good email or are known to send mixed email (yahoo, hotmail). The RDNS is looked up, the host name is verified to see that it matches the name returned, and if the name ends in a host that is on our list then we add the IP address to our white or yellow lists.
+
Junk Email Filter now provides a country code IP lookup. Just use the standard IP lookup (reversed) and read the TXT record, which returns a 2 character country code.
-
We are always looking to expand our white and yellow lists, so if you send email and your server sends only good email and you want to be on our lists, email me at [mailto:marc@perkel.com marc@perkel.com] with your host name information.
+
4.3.2.1.country.junkemailfilter.com TXT
-
== The Magic is in the White Lists ==
+
Return code of "zz" means the country is unknown.
-
Think differently. It's not just about blocking spam - it's about accepting good email. The real power in this system is the white and yellow lists, not the black list. Envision this. A bank that sends nothing but good email is communicating with tens of thousands of customers on a regular basis. Their email goes to thousands of servers that host the customers' email. So let's say that 30 of these servers are feeding data to the database. After a few months the IP address of the bank's server has 100,000 good emails recorded and say 20 spams (some people will accidentally report spam in error). Thus the bank can be whitelisted. Why bother to check email from a host like that for spam?
+
== Experimental Return Codes ==
-
And it's not just banks. It's all institutions that send only good email. No one has to pay a fee to get listed. It's a karma system. Your good reputation gives you a fast pass through the filter.
+
Our lists use a different philosophy than most lists. Instead of making separate calls over and over to separate lists we combine all our information into a single call. The theory is that it is far more efficient to return all the information in a single call, reducing bandwidth and increasing speed through a reduced number of calls.
-
Some servers send a mixture of spam and nonspam. Examples are AOL, Yahoo, Hotmail, and Comcast: people who sell email services, or ISPs. They try to get rid of spam, but some people exploit them anyway. These are servers that make the yellow list. The messages still need to be spam tested, but because they have a reputation of sending some good email they can at least bypass blacklisting. Thus - if a Comcast customer starts spamming through Comcast servers and Comcast doesn't detect it, this system will at least keep the Comcast server from being blacklisted which would prevent other Comcast customers from having their email blocked.
+
The following are experimental codes that we are using internally. These may not be a list of all the return codes we use and we don't guarantee that we will continue to use these codes. But if we list these codes here it's because we have been using them for a while and finding them somewhat useful. If you want to use this information we would appreciate feedback on anything you find that might be interesting. There are 4 billion possible return codes so I don't think we are ever going to run out. Because we provide a lot of information, any software that accesses our lists needs to be prepared to receive and parse the multiple return codes. Here's an example of what you might see on a whitelisted domain:
-
== Problems this Service Solves ==
+
dig wellsfargo.com.hostkarma.junkemailfilter.com
 +
 +
;; ANSWER SECTION:
 +
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.0.1
 +
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.1.1
 +
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.2.3
-
One famous controversy over spam filtering is the battle between [http://www.dearaol.com AOL/Goodmail] vs. the [http://www.eff.org Electronic Frontier Foundation]. In this case both sides are wrong with EFF being a little more wrong than AOL. The Goodmail/AOL relationship is based on the idea that Goodmail certifies email as good and AOL accepts it as good email. But there's $$$ involved and because of this EFF has accused AOL of trying to turn email into a paid service. Unfortunately EFF can't get beyond listening to themselves echo their own opinion to understand that the concepts behind AOL/Goodmail are at least partially sound. The idea is to get the good email through.
+
The results indicate that the domain wellsfargo.com is whitelisted (127.0.0.1), uses QUIT (127.0.1.1), and is familiar to us for over a week (127.0.2.3).
-
This system eliminates the need for AOL/Goodmail's system in that it automatically tracks good email from all servers and makes their karma available to the world. So rather than having to pay to get a reputation as a trusted server all you have to do is consistently send good email and when the world sees that then you get whitelisted. Problem solved.
+
=== Tracking use of QUIT ===
-
== Can Spammers Out Smart This System? ==
+
Usually virus infected spam bots don't close the connection using the QUIT command. That's because the message is already sent and the spam bot isn't going to hang around and be polite and close the connection. This by itself is not sufficient to indicate a spam bot but it is a very important piece that when combined with other behaviors makes spam bot detection both easy and accurate. We track both the host name and IP addresses so you can use hostkarma to look up either one. The codes are as follows:
-
The short answer is yes - probably some can. However it represents yet another significant hurdle for them to cross. In reality this system will block mostly easy to detect spam sources. But that's not where the power lies. It doesn't matter if spammers outsmart this system. What this system does is protect good email from being falsely identified as spam and blocked. This isn't a spam filter as much as a ham filter. The power is in identifying good email.
+
* 127.0.1.1 - QUIT is used
 +
* 127.0.1.2 - No QUIT is used
 +
* 127.0.1.3 - Mixed - Quit is used sometimes
-
To block spam you would just use this as a front end to your system to preclassify the easy spam/ham and then pass the rest on to meaner tests. A spammer might be able to fake their way from being blacklisted to yellowlisted. But not all the way to whitelisted.
+
We do have some mutual exclusion logic and do some counting and other refinements to improve the data. As this is experimental we are not ready to document further details. As with our lists data can be tested as follows:
-
== This Service is under Development ==
+
4.3.2.1.hostkarma.junkemailfilter.com
 +
example.com.hostkarma.junkemailfilter.com
-
What kind of accuracy can you expect using these lists? At the moment the black list isn't as accurate as it should be, in part because we need more data. For example, we are located in the US and we get a lot of spam from outside the US. Some of the servers send us only spam, but if we were in that country then we would see nonspam as well. Thus our limited data can create a false positive.
+
=== Familiar Domains ===
-
The power however is in the white lists and they should be more accurate. These lists can be used to bypass spam filtering for nonspam and increase accuracy and decrease load. And the yellow lists can be used to avoid false positives in other black lists.
+
Spammers often register new domain names and use them for spam. Most commonly they are used as links to sites that the spam wants you to click on. Many of these sites are fraud sites pretending to be banks so that they can get your account information and steal your money. But there is no easy way to get a list of new domains. Several people have tried but by the time they process the data the domains have been in operation for some time.
-
== The Future of the Concept - The Big Picture ==
+
So instead of listing new domains what we are trying to do is list old domains in what we call our familiar list. The idea being that if the domain isn't listed then it is unfamiliar and thus new domains can be detected instantly upon being used. Of course this detects domains that are familiar to us, so if an old domain contacts our servers for the first time it is also unfamiliar. So although we can detect 100% of new domains, not all domains detected as new are actually new. They are just new to us.
-
This system can make a huge difference in the accuracy of spam detection for the entire planet. Every email server on the planet - if it were scaled up - could access these lists and eliminate some 50%  of all spam and identify some 95% of all nonspam with 100% accuracy with extremely little effort. To do it right would take several major partners getting involved and better programmers than me to do it right.  
+
So keeping this in mind, being unfamiliar isn't anything you would want to use for blocking but rather one piece of information that when combined with other signs indicates that the unfamiliar domain is being used for fraud. We also track how long the domain is known to us so that creates an age indication that might be useful.
-
Here's what it would take:
+
* 127.0.2.1 - domains we first saw in the last 24-48 hours
 +
* 127.0.2.2 - domains we first saw in the last 10 days
 +
* 127.0.2.3 - domains that are older than 10 days
-
You would have a central (replicated) [http://www.mysql.org MySQL] cluster that is big and hardened and secret and immune from DOS attacks. This is where the data for the lists is kept. If done right it might run on less than $10,000 worth of hardware.
+
And, of course, if not listed then the domain is totally unfamiliar to us. Domains are read by reading the hostkarma list as follows:
-
As a front end to this are a number of [http://mydns.bboy.net MyDNS] servers and caching front end servers that connect to the databases on the back end and providing a front end for email servers all over the world to access. It would also take some smart people and many servers running Spam Assassin to check the quality of the lists and verifying that the lists remain accurate. And it might take a few people to watch over it to make sure there aren't any problem and some programmers to adapt to spammers who will always try to beat the system.
+
example.com.hostkarma.junkemailfilter.com
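
Because a single lookup can return several codes, client software has to sort them into groups. The following Python sketch shows one way to do that using only the codes documented on this page. The function name decode_codes and the label strings are assumptions made for this example, and since the codes are experimental the tables may be incomplete. Note that 127.0.0.4 is labelled brownlist in the IP code list and URIBL in the name based code list; the sketch uses the brownlist label.

```python
# Code tables copied from this page; labels are illustrative.
LIST_STATUS = {
    "127.0.0.1": "white",
    "127.0.0.2": "black",
    "127.0.0.3": "yellow",
    "127.0.0.4": "brown",
    "127.0.0.5": "nobl",
}
QUIT_STATUS = {
    "127.0.1.1": "quit",
    "127.0.1.2": "no-quit",
    "127.0.1.3": "mixed",
}
FAMILIARITY = {
    "127.0.2.1": "first seen in the last 24-48 hours",
    "127.0.2.2": "first seen in the last 10 days",
    "127.0.2.3": "older than 10 days",
}


def decode_codes(codes):
    """Group returned A record values by meaning (hypothetical helper).

    Codes that are not in the tables above are kept under "unknown"
    instead of being dropped, since new codes may appear.
    """
    result = {"list": [], "quit": None, "familiarity": None, "unknown": []}
    for code in sorted(codes):
        if code in LIST_STATUS:
            result["list"].append(LIST_STATUS[code])
        elif code in QUIT_STATUS:
            result["quit"] = QUIT_STATUS[code]
        elif code in FAMILIARITY:
            result["familiarity"] = FAMILIARITY[code]
        else:
            result["unknown"].append(code)
    return result


if __name__ == "__main__":
    print(decode_codes(["127.0.0.1", "127.0.1.1", "127.0.2.3"]))
```

A domain with no familiarity code at all is totally unfamiliar to us, so a value of None in that field carries information too.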
-
This isn't going to solve the spam problem. But if done right it will significantly reduce the false positive problem allowing for far greater front end accuracy. This will greatly reduce system load and make the remaining email easier to process.
+
== Data Life ==
-
== Privacy Issues ==
+
Blacklist data lives about 5 days, so if you are wrongly blacklisted or if you had a virus and fixed the problem you will automatically be delisted 5 days after the spamming stops. White list data lives about 10 days. The exceptions are those who are permanently white listed or black listed.
-
This system is totally privacy friendly. It does not requite any kind of personal information or the sharing of message content or header content. It merely keeps totals of the karma of the IP of the sending host. So personal liberty is preserved. This system is liberty friendly and help ensure the delivery of the email you want to send and the email you want to get.
+
== Blacklist Testing and other testing Tools ==
-
== Joining In - Helping Development ==
+
[[http://multirbl.valli.org/ Valli black list testing tool]]
 +
[[http://www.dnsstuff.com/ DNS Stuff]]
 +
[[http://mailboxtools.com/ Mailbox Tools]]
 +
[[http://dnssy.com/index.php DNSay]]
-
I need some help with this. If you are the person in charge of a large email system, preferably running Exim and Spam Assassin, and you are technically sharp, I can use some help making this system better through testing and development. it is also a system where the more data I have the more accurate and comprehensive the lists will be. So if you like what you are reading here then join in and let's make it happen.
+
== Blacklist Compared ==
-
First - use the lists. Add the above code to your ACLs and set it up to use the white, yellow, and black lists. Once you are comfortable with that then contact me and I'll set you up with access so that you can submit your data to the counters so that I can incorporate your information into the system.
+
How does our lists compare to other lists. Here's some web sites where lists are compared.
-
The data you send will not violate anyone's privacy. I just need the IP address of the server and if it is spam or nonspam. The code is fairly simple.
+
[[http://www.intra2net.com/en/support/antispam/index.php Intra2net]]
 +
[[http://www.spamcannibal.org/dnsbl_compare.shtml Spam Cannible]]
-
My email address is [mailto:marc@perkel.com marc@perkel.com] and my spam filtering is so good that I don't have to hide it.
+
== You can help us help you by building our list ==
-
== Feeding Us Data ==
+
If you want to participate in helping to build our lists and further reduce your spam you can participate in our [[project tarbaby]]. This will give you a little free spam reduction and allow us to harvest some spam bot data to help build our lists. It involves setting your highest MX to point to our Tarbaby server.
-
We are looking for some good data feeds to help expand the list and improve accuracy. Feeding data involves running a simple shell scritp that basically sends a string to a port to report that an IP has sent spam or ham. The script looks like this:
+
Just add this as your highest numbered MX record.
-
  #!/bin/bash
+
  tarbaby.junkemailfilter.com
-
# ip-report script - GPL by Marc Perkel - 2006
+
-
#
+
-
# Usage: ip-report message ip_address
+
-
# Examples: - ip-report spam 1.2.3.4
+
-
#            ip-report ham  5.6.7.8
+
-
#
+
-
# Email me for the host and port info
+
-
#
+
-
# Runs netcat to send a string to a port on a host
+
-
+
-
echo "$*" | nc -w 4 host port
+
-
The idea here is to just submit IPs that you have a very high confidence are spam or nonspam. These submissions go into a MySQL database and every 5 minutes the live lists ae modified to reflect the statistical data. So alterations are live. A nonspam submission will instantly yellow list the IP and remove it from the black list.
+
That will help us build our list, reduce your spam, and help tune the list to those spamming you making our black list even more effective for you.
-
What I'm looking for are people who process a LOT of email, who are innovative, and who can get excited about this concept and feed a lot of good data. I'm also interested in feedback and ways to improve the system.
+
== What Kinds of Spam Does this list Work With? ==
-
The system now supports 4 kinds of feedback.
+
The black list catches spam only servers. Generally these include virus infected users who are being used as spam servers. The list is generated by honeypot accounts and spammer's behavior where spam is caught be dong things that only spammers do. We have developed a lot of unique methods of detecting spam based on the behavior of the spammer. We can detect spammers by the way they try to deliver email rather than by the content of the message.
-
spam    - messages that are almost certianly spam
+
The real power here is in the white lists. Those who are used to spam filtering need to think differently about spam processing in order to really get the idea. You have to understand that we are not just looking for spam. This list is to catch nonspam. Nonspam is actually easier in some ways because the nonspam servers aren't doing any tricks to hide. They consistently send out good mail. All we do is track that and once the server establishes a clean reputation we bless it.
-
ham      - messages that are almost certianly not spam
+
-
nonspam - messages that you think are probably not spam 
+
-
lowspam  - messages that you think are probably spam
+
-
For those of you familiar with Spam Assassin, spam would be a message scoring 15  points and ready for bayesian autolearn. Lowsam would be a meaage from 5-15 points. Nonspam would be in the range of -2 - 5 points. And ham would be below -2 points and ready to autolearn.
+
We also have ways of detecting nonspam that spammers can't duplicate. We use these methods to build our white lists ensuring that good email gets delivered.
-
A sample Exim ACL to report spam might look like this: (untested)
+
== How the System Works ==
-
warn spam=nobody
+
Telling all my tricks would be too long. But central to the system is tracking hosts by collecting data by IP address and doing an analysis on the information to determine the karma of the host.
-
      set acl_c5 = $spam_score_int
+
 
-
+
The idea is that multiple trusted servers feed data to a database that tracks IP addresses and counts the number of spams/nonspams sent by these hosts. A spam increments the spam counter. A nonspam increments the nonspam counter. As the counts go up the servers develop a reputation. Those who spew only spam make the blacklist. Those who spew only good email make the white list. And those who spew a mix make the yellow list.
-
warn condition = ${if >{$acl_c5}{150}{yes}{no}}
+
 
-
      condition = ${run {/etc/exim/ip-report spam $sender_host_address}{yes}{yes}}
+
Other technology is also used. Honeypot can blacklist a virus infected server instantly allowing the system to have a very fast response time to new spam servers. The system can also track good servers over a long time tracking good email and establishing a reputation. Much of the blacklist data comes from using fake low and high MX records. When a host hits only the fake high numbered MX records without hitting the low numbered MX records the host is a virus infected spam zombie.
-
+
 
-
warn spam = nobody
+
White and Yellow listing are also done using a table of domain names that are known to only send good email or are know to send mixed email, (yahoo, hotmail). The RDNS is looked up, the host name is verified to see that it matches the name returned, and if the name ends in a host that is on our list then we add the IP address to our white or yellow lists.
-
      condition = ${if <{$acl_c5}{-20}{yes}{no}}
+
 
-
      condition = ${run {/etc/exim/ip-report ham $sender_host_address}{yes}{yes}}
+
We are always looking to expand our white and yellow lists so if you send email and your server send only good email and you want to be on our lists, email me at [mailto:marc@perkel.com marc@perkel.com] with your host name information.
 +
 
 +
== The Magic is in the White Lists ==
 +
 
 +
Think differently. It's not just about blocking spam - it's about accepting good email. The real power in this system is the white and yellow lists, not the black list. Envision this. A bank who sends nothing but good email is communicating with tens of thousands of customers on a regular basis. Their email goes to thousands of servers who host the customer's email. So lets say that 30 of these servers are feeding data to the database. After a few months the IP address of the bank's server has 100,000 good emails recorded and say 20 spams (some people will accidently report spam in error). Thus the bank can be whitelisted. Why bother to check email from a host like that for spam?
 +
 
 +
And it's not just banks. It's all institutions that send only good email. No one has to pay a fee to get listed. It's a karma system. You're good reputation gives you a fast pass through the filter.
 +
 
 +
Some serves send a mixture of spam and nonspam. Example are AOL, Yahoo, Hotmail, Comcast. People who sell email services or ISPs. They try to get rid of spam, but some people exploit them anyway. These are servers that make the yellow list. The messages still need to be spam tested, but because they have a reputation of sending some good email they can at least bypass blacklisting. Thus - if a Comcast customer starts spamming through Comcast servers and Comcast doesn't detect it, this system will at least keep the Comcast server from being blacklisted which would prevent other Comcast customers from having their email blocked.
 +
 
 +
== Can Spammers Out Smart This System? ==
 +
 
 +
The short answer is yes - probably some can. However it represents yet another significant hurdle for them to cross. In reality this system will block mostly easy to detect spam sources. But - that's not where the power lies. it doesn't matter if spammers out smart this system. What this system does is protect good email from being falsely identified as spam and blocked. This isn't a spam filter as much as a ham filter. The power is in identifying good email.
-
Note that $spam_score_int is 10 times what the spam score is.
+
To block spam you would just use this as a frontend to your system to preclassify the easy spam/ham and them pass the rest on to meaner tests. A spammer might be able to fake their way from being blacklisted to yellowlisted. But not all the way to whitelisted.

Revision as of 00:15, 3 January 2016

Were you Blacklisted and want to be removed?

Have you been blacklisted on our Hostkarma list? To check or be removed Click Here.

Creating White/Yellow/Black DNS lists for email systems in the fight against spam.

These are free DNS host karma list servers that tell the world which servers are sending spam, nonspam, or a mix of spam and nonspam. This is a service of Junk Email Filter dot com, and one of many technologies used in advanced email filtering.

LICENSE - Using these Lists is Free

Unless you put a heavy load on our servers and use a lot of bandwidth, use of these lists is almost free.

  • If you are a non-profit organization, usage of this list is free. In fact, if you are a progressive nonprofit you might qualify for free spam filtering service as our way of helping to support progressive causes. (We determine what we consider progressive.)
  • If you are a small business, you can use it for free. In return, we ask as a favor (not a requirement) that you thank us somewhere on your web site with a link to http://www.junkemailfilter.com.

List Attitude

Different lists have different criteria for listing that to a large extent reflect the personality of the people behind the list. Some lists are angry lists where they list everything and if you got on their list it's your fault. There are also lists that have nothing to do with spam, but try to punish behavior that they don't like, or try to promote technologies that do not work.

This list is not an angry list. We focus on the reality of what really works.

Our position is that if you are a spammer we want to block you. If you are not a spammer we want to make sure your email gets delivered. And if you have been hacked or have a virus we want to help you get back to normal and get you off our blacklist as quickly as we can. If your server is misconfigured, we want to help you get it right so that your good email can be delivered as efficiently as possible. And if you never send spam we want you to be on our whitelist. To us it's all about delivering good email and blocking bad email. Our mission is to get it right and to be professional and friendly about it. And because there is so much spam out there, we want to partner with our competitors so that we can all keep our customers happy and the spammers unhappy.

Our system also sends out automated notices to alert spam sources of problems and to get feedback in case we have a problem rejecting good email so that we can fix problems that we don't know about. This helps ISPs and office network admins find and shut down virus infected computers, reducing spam across the planet. Our view is that the best way to fight spam is to stop it at the source.

Junk Email Filter uses innovative techniques to fight spam. We have been the leader in introducing several new spam fighting technologies. This list is an example of our commitment not just to be accurate but to be efficient. Most lists are just black lists. A few have multiple return codes as to why the IP is blacklisted. There are also a few white lists, but most of those white lists are really lists of servers not to blacklist. Our lists go much further.

  1. Besides black lists and white lists, we have yellow lists and NOBL lists (our invention). White on our lists means that anything that comes from the source is good email and needs no further testing. NOBL is like most others' white lists: it means this IP or host name should not be blacklisted, so there is no need to check the black lists. Yellow listing indicates a mixed source of good email and spam. Sources like Hotmail, Yahoo, and Gmail are yellow sources. Yellow means that the IP address or host name carries no information about whether the email is good or bad, so there is no reason to check white or black lists.
  2. Instead of having a lot of separate lists, which would require multiple DNS lookups, we support a single DNS lookup and return different codes for the status of the IP. In some cases we return multiple codes indicating that the IP meets multiple conditions. At this time we are the only DNS list that does this, but the efficiency of a single-call lookup cannot be ignored, and we think that this is a model for the future.
  3. Forward-confirmed reverse DNS (FCrDNS) lookup is almost impossible to spoof. We therefore think there is an opportunity to provide host name lookups based on FCrDNS, not just for blacklisted names but for all the other colors too. As with the IP lookups, a single DNS call returns multiple results covering everything useful that we know.
  4. Unlike most lists and spam filtering systems, which focus on black lists, we focus on white lists as well as NOBL and Yellow lists to actively detect good email and protect good email from being misclassified. It's not just a matter of catching spam and letting everything else pass. We actively detect good email and pass it through as quickly and efficiently as possible. This allows us to pass good email faster by avoiding unnecessary spam tests on email we can easily determine is good. This is a philosophy we have worked hard to instill in the spam fighting community.

Retaliatory Listings

We do everything possible to make sure that legitimate email is not blocked by our lists and we expect those who run lists to do the same. If any other list knowingly blacklists our IP space or fails to remove blacklistings against us after being informed of the error, we reserve the right to blacklist that list's IP space.

Types of Listings

Our listings are a little different than some RBLs. Instead of having you make several DNS calls to test for white or black listing we return it in one call and we have different result codes based on the reputation of the host name or IP being looked up. We also provide other general information that isn't black or white but might be useful in determining if something is good or not in combination with other tests. Here are some features of our lists.

  • Black Lists - As with other lists we black list IPs, and host names!
  • White Listings - We list good IPs as well as bad
  • Yellow Listing - For Google, Outlook, Hotmail, Yahoo - IP contains no useful information.
  • Quit / NotQuit - Do they close the connection Properly?
  • Name Based Lookups - Not just IPs but host names too!
  • Domain Age - If the host name is familiar or not. Can be used to catch spammers using newly registered domains
  • TLDs - test for legit Top Level Domains
  • Registry Barriers - Test to see where the main domain stops and the subdomain starts.
  • Country Codes - Look up what country the IP came from.

How to use the Lists

Junk Email Filter dot com provides several public lists -- one is a black list to block spam and the other is a white list to either pass nonspam/ham or to keep sites from being blocked. Blocking is done by IP address which is something spammers can't spoof. We look at email hosts as being one of these kinds:

  • hosts that generate only spam that we blacklist
  • hosts that generate a mix which we yellow list.
  • hosts that generate only nonspam which we whitelist

Our list server is hostkarma.junkemailfilter.com - this server returns several different results depending on what kind of listing it is. If the server returns 127.0.0.1 then it is whitelisted. You can accept the email without any further checking.

If the result is 127.0.0.3 then the host is yellow listed. Yellow listing means that host generates some spam and some nonspam (examples: yahoo.com, hotmail.com). What that means is that this host should never be blacklisted and that other IP based blacklists should be bypassed to prevent false positives.

If the result is 127.0.0.2 it is blacklisted - if the IP is listed here you can bounce it without further checking.

And if the result is 127.0.0.4 it is brownlisted which means it is on its way to being blacklisted but hasn't quite got there yet. But it might be worth a few points using SpamAssassin.

  • 127.0.0.1 - whitelist - trusted nonspam
  • 127.0.0.2 - blacklist - block spam
  • 127.0.0.3 - yellowlist - mix of spam and nonspam
  • 127.0.0.4 - brownlist - all spam - but not yet enough to blacklist
  • 127.0.0.5 - NOBL - This IP is not a spam only source and no blacklists need to be tested

Like all IP based lists, the octets of the client's IP address are reversed in order and the list name is appended. So if you were to look up 1.2.3.4 you would query the DNS for the following hostname:

4.3.2.1.hostkarma.junkemailfilter.com
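
As a rough illustration of the query format above, here is a minimal Python sketch. The function names `ip_query_name` and `lookup_codes` are made up for this example and are not part of any hostkarma tooling. The lookup uses the system resolver through the standard `socket` module and treats a failed lookup as "not listed", which is an assumption about how you would want to handle errors.

```python
import ipaddress
import socket

HOSTKARMA_ZONE = "hostkarma.junkemailfilter.com"


def ip_query_name(ip: str, zone: str = HOSTKARMA_ZONE) -> str:
    """Reverse the octets of an IPv4 address and append the list zone."""
    # IPv4Address() raises ValueError on anything that is not a valid IPv4 address.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def lookup_codes(query_name: str) -> set:
    """Return every A record the list gives for query_name (empty set if none).

    Assumption: any resolver error (including "no such name") is treated
    as "not listed".
    """
    try:
        _name, _aliases, addresses = socket.gethostbyname_ex(query_name)
    except OSError:
        return set()
    return set(addresses)
```

Calling `lookup_codes(ip_query_name("1.2.3.4"))` would then give the set of return codes for that address, for example `{"127.0.0.2"}` for a blacklisted host.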

Name Based Lookups

In addition to IP based lookups the hostkarma list also supports name based lookups. If you wanted to look up wellsfargo.com, you would query the DNS for the following hostname:

wellsfargo.com.hostkarma.junkemailfilter.com

As with the IP lists a 127.0.0.1 is a white listing, 127.0.0.2 is a black listing. The return codes are the same as listed above for IP addresses.

List Logic

The best way to use the lists is to do it in a specific order:

First you check the white list and see if the host is white listed. If so, you accept the message without further processing. Then you see if the host is yellow listed. If so, you skip all your blacklist tests. Then you check your blacklists, and if the host is listed you bounce the message. Whatever email is left is then tested with all your other testing methods like Spam Assassin.
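
The order of checks described above can be sketched in a few lines of Python. This is only an illustration: the function name `next_step` and the action strings it returns are invented for this example, and treating NOBL (127.0.0.5) the same as yellow is an assumption based on the return code table, not something the paragraph above states.

```python
WHITE = "127.0.0.1"
BLACK = "127.0.0.2"
YELLOW = "127.0.0.3"
NOBL = "127.0.0.5"


def next_step(codes: set) -> str:
    """Decide what to do with a message given the hostkarma return codes.

    The checks run in the order white, yellow, black, so a host that is
    both yellow and black listed is never rejected on the black listing.
    """
    if WHITE in codes:
        return "accept"                     # no further processing needed
    if YELLOW in codes or NOBL in codes:
        return "filter_without_blacklists"  # skip every blacklist test
    if BLACK in codes:
        return "reject"
    # Not decided by hostkarma: run your other blacklists and content tests.
    return "filter"
```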

Exim Examples

Exim is an extremely powerful MTA, probably the most powerful MTA on the planet. That's why I like it so much. I want to do what I want to do and Exim allows me to do it.

# Mark it White 
warn dnslists = hostkarma.junkemailfilter.com=127.0.0.1
     set acl_c_white = white - dnswl - $sender_fullhost

# Mark it Yellow 
warn dnslists = hostkarma.junkemailfilter.com=127.0.0.3
     set acl_c_yellow = yellow - $sender_fullhost

# Using the Black List
deny dnslists = hostkarma.junkemailfilter.com=127.0.0.2

# Other Blacklists
deny !dnslists = hostkarma.junkemailfilter.com=127.0.0.1,127.0.0.3
     dnslists = zen.spamhaus.org/<;$sender_host_address;$sender_address_domain :\
     nomail.rhsbl.sorbs.net/$sender_address_domain : cbl.abuseat.org :\ 
     list.dsbl.org : web.dnsbl.sorbs.net : socks.dnsbl.sorbs.net :\
     http.dnsbl.sorbs.net

Postfix Examples

Postfix For Blacklisting:

smtpd_client_restrictions = reject_rbl_client hostkarma.junkemailfilter.com=127.0.0.2 

Postfix For Whitelisting and Blacklisting:

smtpd_client_restrictions = permit_dnswl_client hostkarma.junkemailfilter.com=127.0.0.1,
    reject_rbl_client hostkarma.junkemailfilter.com=127.0.0.2

Spam Assassin Examples

Spam Assassin can access the white and black lists for scoring.

header __RCVD_IN_HOSTKARMA eval:check_rbl('HOSTKARMA-lastexternal','hostkarma.junkemailfilter.com.')
describe __RCVD_IN_HOSTKARMA Sender listed in JunkEmailFilter
tflags __RCVD_IN_HOSTKARMA net
 
header RCVD_IN_HOSTKARMA_W eval:check_rbl_sub('HOSTKARMA-lastexternal', '127.0.0.1')
describe RCVD_IN_HOSTKARMA_W Sender listed in HOSTKARMA-WHITE
tflags RCVD_IN_HOSTKARMA_W net nice
score RCVD_IN_HOSTKARMA_W -5
 
header RCVD_IN_HOSTKARMA_BL eval:check_rbl_sub('HOSTKARMA-lastexternal', '127.0.0.2')
describe RCVD_IN_HOSTKARMA_BL Sender listed in HOSTKARMA-BLACK
tflags RCVD_IN_HOSTKARMA_BL net
score RCVD_IN_HOSTKARMA_BL 3.0
 
header RCVD_IN_HOSTKARMA_BR eval:check_rbl_sub('HOSTKARMA-lastexternal', '127.0.0.4')
describe RCVD_IN_HOSTKARMA_BR Sender listed in HOSTKARMA-BROWN
tflags RCVD_IN_HOSTKARMA_BR net
score RCVD_IN_HOSTKARMA_BR 1.0

Implementing Name Based DNS Lookup

The hostkarma DNS list supports name based lookups as well as IP based lookups.

<hostname>.hostkarma.junkemailfilter.com

  • 127.0.0.1 = whitelisted
  • 127.0.0.2 = blacklisted
  • 127.0.0.3 = yellowlisted
  • 127.0.0.4 = URIBL
  • 127.0.0.5 = NOBL listed

Example: dig hermes.apache.org.hostkarma.junkemailfilter.com

Examples using Exim:

accept	dnslists = hostkarma.junkemailfilter.com=127.0.0.1/$sender_host_name
deny	dnslists = hostkarma.junkemailfilter.com=127.0.0.2/$sender_host_name

Examples using Postfix:

reject_rhsbl_sender hostkarma.junkemailfilter.com=127.0.0.2

No Blacklist List

We have also created a No Blacklist list (NOBL) of IP addresses and host names that are either white listed, yellow listed, or otherwise determined to be sources that should never be in any blacklist.

The purpose of the list is to avoid false positives. If you are running any kind of DNS list check you can read this list first, and if the host is listed then you need not test any other blacklists because they will be wrong.

  • 127.0.0.1 = whitelisted - accept as good
  • 127.0.0.3 = yellowlisted - mixed source - do not blacklist or whitelist
  • 127.0.0.5 = nobl listed - not a spam source - do not blacklist - maybe whitelist

Any result from this list means do not blacklist. The list is accessed as follows:

accept	dnslists = nobl.junkemailfilter.com
....
blacklist tests

Both name and IP queries to this list are accepted:

4.3.2.1.nobl.junkemailfilter.com
mydomain.com.nobl.junkemailfilter.com

Country Code List

Junk Email Filter now provides a country code IP lookup. Just use the standard IP lookup (reversed) and read the TXT record, which returns a 2 character country code.

4.3.2.1.country.junkemailfilter.com TXT

Return code of "zz" means the country is unknown.

Experimental Return Codes

Our lists use a different philosophy than most lists. Instead of making separate calls over and over to separate lists we combine all our information into a single call. The theory is that it is far more efficient to return all the information in a single call, reducing bandwidth and increasing speed through a reduced number of calls.

The following are experimental codes that we are using internally. This may not be a complete list of the return codes we use and we don't guarantee that we will continue to use these codes. But if we list these codes here it's because we have been using them for a while and finding them somewhat useful. If you want to use this information we would appreciate feedback on anything you find that might be interesting. There are 4 billion possible return codes so I don't think we are ever going to run out. Because we provide a lot of information, any software that accesses our lists needs to be prepared to receive and parse multiple return codes. Here's an example of what you might see on a whitelisted domain:

dig wellsfargo.com.hostkarma.junkemailfilter.com

;; ANSWER SECTION:
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.0.1
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.1.1
wellsfargo.com.hostkarma.junkemailfilter.com. 2100 IN A 127.0.2.3

The results indicate that the domain wellsfargo.com is whitelisted (127.0.0.1), uses QUIT (127.0.1.1), and has been familiar to us for more than 10 days (127.0.2.3).
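
Software that reads these answers has to cope with several A records at once. The following Python sketch shows one way to turn a set of answers into readable labels. The table `CODE_MEANINGS` and the function `describe_codes` are names made up for this example, the labels are paraphrased from the code tables on this page, and any code not in the table is simply reported as unknown.

```python
import ipaddress

# Meanings taken from the return code tables on this page.
# 127.0.0.4 is described as brownlist for IP lookups; the name based
# table labels the same code URIBL.
CODE_MEANINGS = {
    "127.0.0.1": "whitelisted",
    "127.0.0.2": "blacklisted",
    "127.0.0.3": "yellowlisted",
    "127.0.0.4": "brownlisted",
    "127.0.0.5": "NOBL listed",
    "127.0.1.1": "QUIT is used",
    "127.0.1.2": "no QUIT is used",
    "127.0.1.3": "QUIT is used sometimes",
    "127.0.2.1": "first seen in the last 24-48 hours",
    "127.0.2.2": "first seen in the last 10 days",
    "127.0.2.3": "familiar for more than 10 days",
}


def describe_codes(addresses):
    """Return (code, meaning) pairs in numeric order, one per distinct code."""
    ordered = sorted(set(addresses), key=ipaddress.IPv4Address)
    return [(code, CODE_MEANINGS.get(code, "unknown code")) for code in ordered]
```

Fed the three answers from the wellsfargo.com example, `describe_codes` would report the domain as whitelisted, using QUIT, and familiar for more than 10 days.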

Tracking use of QUIT

Usually virus infected spam bots don't close the connection using the QUIT command. That's because the message is already sent and the spam bot isn't going to hang around and be polite and close the connection. This by itself is not sufficient to indicate a spam bot, but it is a very important piece that, when combined with other behaviors, makes spam bot detection both easy and accurate. We track both host names and IP addresses so you can use hostkarma to look up either one. The codes are as follows:

  • 127.0.1.1 - QUIT is used
  • 127.0.1.2 - No QUIT is used
  • 127.0.1.3 - Mixed - Quit is used sometimes

We do have some mutual exclusion logic and do some counting and other refinements to improve the data. As this is experimental we are not ready to document further details. As with our other lists, the data can be queried as follows:

4.3.2.1.hostkarma.junkemailfilter.com
example.com.hostkarma.junkemailfilter.com

Familiar Domains

Spammers often register new domain names and use them for spam. Most commonly they are used as links to sites that the spam wants you to click on. Many of these sites are fraud sites pretending to be banks so that they can get your account information and steal your money. But there is no easy way to get a list of new domains. Several people have tried, but by the time they process the data the domains have been in operation for some time.

So instead of listing new domains, what we are trying to do is list old domains in what we call our familiar list. The idea is that if a domain isn't listed then it is unfamiliar, and thus new domains can be detected the instant they are used. Of course, this only detects domains that are familiar to us, so if an old domain contacts our servers for the first time it is also unfamiliar. So although we can detect 100% of new domains, not all domains detected as new are actually new. They are just new to us.

So, keeping this in mind, being unfamiliar isn't something you would want to use for blocking, but rather one piece of information that, when combined with other signs, indicates that the unfamiliar domain is being used for fraud. We also track how long the domain has been known to us, which gives an age indication that might be useful.

  • 127.0.2.1 - domains we first saw in the last 24-48 hours
  • 127.0.2.2 - domains we first saw in the last 10 days
  • 127.0.2.3 - domains that are older than 10 days

And, of course, if a domain is not listed then it is totally unfamiliar to us. Domains are looked up by querying the hostkarma list as follows:

example.com.hostkarma.junkemailfilter.com
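
The point that the absence of a familiarity code is itself a signal can be shown with a small Python sketch. The function name `familiarity` and the short labels it returns are invented for this example; only the three codes come from the list above.

```python
# Familiarity codes from the list above, newest first.
FAMILIARITY_CODES = (
    ("127.0.2.1", "new"),          # first seen in the last 24-48 hours
    ("127.0.2.2", "recent"),       # first seen in the last 10 days
    ("127.0.2.3", "established"),  # older than 10 days
)


def familiarity(codes: set) -> str:
    """Map the return codes for a domain to a familiarity label.

    A domain with no familiarity code at all has never been seen by the
    list, which is what makes brand new domains show up immediately.
    """
    for code, label in FAMILIARITY_CODES:
        if code in codes:
            return label
    return "unfamiliar"
```

As the text says, a result of "unfamiliar" or "new" would be one input to combine with other tests, not a reason to block on its own.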

Data Life

Blacklist data lives about 5 days, so if you are wrongly blacklisted or if you had a virus and fixed the problem you will automatically be delisted 5 days after the spamming stops. White list data lives about 10 days. The exceptions are those who are permanently white listed or black listed.
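
To make the expiry rule concrete, here is a minimal Python sketch. It is not the actual implementation: the name `is_expired`, the `permanent` flag, and the use of exactly 5 and 10 days (the text says "about") are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Approximate lifetimes from the text above, taken here as exact values.
LIFETIMES = {
    "black": timedelta(days=5),
    "white": timedelta(days=10),
}


def is_expired(kind: str, last_seen: datetime, now: datetime,
               permanent: bool = False) -> bool:
    """Return True if a listing should be dropped.

    last_seen is the last time the host showed the listed behavior,
    for example the last spam seen from a blacklisted host.
    """
    if permanent:
        return False  # permanent white or black listings never age out
    return now - last_seen > LIFETIMES[kind]
```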

Blacklist Testing and other testing Tools

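[Valli black list testing tool] http://multirbl.valli.org/
[DNS Stuff] http://www.dnsstuff.com/
[Mailbox Tools] http://mailboxtools.com/
[DNSay] http://dnssy.com/index.php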

Blacklist Compared

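How do our lists compare to other lists? Here are some web sites where lists are compared.

[Intra2net] http://www.intra2net.com/en/support/antispam/index.php
[Spam Cannibal] http://www.spamcannibal.org/dnsbl_compare.shtml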

You can help us help you by building our list

If you want to participate in helping to build our lists and further reduce your spam you can participate in our project tarbaby. This will give you a little free spam reduction and allow us to harvest some spam bot data to help build our lists. It involves setting your highest MX to point to our Tarbaby server.

Just add this as your highest numbered MX record.

tarbaby.junkemailfilter.com

That will help us build our list, reduce your spam, and help tune the list to those spamming you making our black list even more effective for you.

What Kinds of Spam Does this list Work With?

The black list catches spam-only servers. Generally these include virus infected users who are being used as spam servers. The list is generated from honeypot accounts and from spammer behavior, where senders are caught doing things that only spammers do. We have developed a lot of unique methods of detecting spam based on the behavior of the spammer. We can detect spammers by the way they try to deliver email rather than by the content of the message.

The real power here is in the white lists. Those who are used to spam filtering need to think differently about spam processing in order to really get the idea. You have to understand that we are not just looking for spam. This list is to catch nonspam. Nonspam is actually easier in some ways because the nonspam servers aren't doing any tricks to hide. They consistently send out good mail. All we do is track that and once the server establishes a clean reputation we bless it.

We also have ways of detecting nonspam that spammers can't duplicate. We use these methods to build our white lists ensuring that good email gets delivered.

How the System Works

Telling all my tricks would be too long. But central to the system is tracking hosts by collecting data by IP address and doing an analysis on the information to determine the karma of the host.

The idea is that multiple trusted servers feed data to a database that tracks IP addresses and counts the number of spams/nonspams sent by these hosts. A spam increments the spam counter. A nonspam increments the nonspam counter. As the counts go up the servers develop a reputation. Those who spew only spam make the blacklist. Those who spew only good email make the white list. And those who spew a mix make the yellow list.
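
The counting scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the real system: the class `HostKarma`, the function `classify`, the minimum of 100 reports, and the 0.1% tolerance for stray reports are all invented for this example.

```python
from dataclasses import dataclass


@dataclass
class HostKarma:
    """Spam and nonspam counters for one sending IP address."""
    spam: int = 0
    nonspam: int = 0

    def report(self, kind: str) -> None:
        """Record one report from a trusted server."""
        if kind == "spam":
            self.spam += 1
        elif kind == "nonspam":
            self.nonspam += 1
        else:
            raise ValueError(f"unknown report kind: {kind!r}")


def classify(karma: HostKarma, min_reports: int = 100,
             tolerance: float = 0.001) -> str:
    """Turn the counters into a list color.

    min_reports and tolerance are assumed values. The tolerance lets a
    handful of mistaken reports through without changing the color.
    """
    total = karma.spam + karma.nonspam
    if total < min_reports:
        return "unlisted"  # not enough history to have a reputation yet
    spam_ratio = karma.spam / total
    if spam_ratio <= tolerance:
        return "white"
    if spam_ratio >= 1 - tolerance:
        return "black"
    return "yellow"
```

With these assumed thresholds, a host with 100,000 good emails and 20 spam reports comes out white, a host that has sent only spam comes out black, and a mixed sender comes out yellow.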

Other technology is also used. Honeypots can blacklist a virus infected server instantly, allowing the system to have a very fast response time to new spam servers. The system can also track good servers over a long time, tracking good email and establishing a reputation. Much of the blacklist data comes from using fake low and high MX records. When a host hits only the fake high numbered MX records without hitting the low numbered MX records, the host is a virus infected spam zombie.
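
The fake MX test can be pictured as a simple set difference. In this Python sketch the event format, a list of `(ip, target)` pairs where `target` is `"high"` or `"low"`, and the function name `find_zombie_suspects` are assumptions made for illustration. How connections are actually logged is not described here.

```python
def find_zombie_suspects(events):
    """Return the IPs that connected to a high numbered MX but never a low one.

    events is an iterable of (ip, target) pairs, where target is "high"
    for the fake high numbered MX and "low" for a low numbered MX.
    """
    hit_high = set()
    hit_low = set()
    for ip, target in events:
        if target == "high":
            hit_high.add(ip)
        elif target == "low":
            hit_low.add(ip)
    # A normal mail server tries the low numbered MX records first,
    # so it appears in hit_low and drops out of the result.
    return hit_high - hit_low
```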

White and Yellow listing are also done using a table of domain names that are known to send only good email or are known to send mixed email (yahoo, hotmail). The RDNS is looked up, the host name is verified to see that it resolves back to the same IP address, and if the name ends in a domain that is on our list then we add the IP address to our white or yellow lists.
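
The lookup-and-verify step described above is the usual forward-confirmed reverse DNS check. Here is a minimal Python sketch of it using the standard `socket` module. The function names `fcrdns_name` and `matches_listed_domain` are made up for this example, and the resolver functions are passed in as parameters only so the logic can be exercised without a network.

```python
import socket


def fcrdns_name(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Return the verified host name for ip, or None if verification fails.

    The name from the reverse lookup only counts if a forward lookup of
    that name includes the original IP address.
    """
    try:
        name, _aliases, _addrs = reverse(ip)
        _name, _aliases, addresses = forward(name)
    except OSError:
        return None
    if ip not in addresses:
        return None
    return name.lower().rstrip(".")


def matches_listed_domain(hostname, listed_domains):
    """True if hostname is a listed domain or a subdomain of one."""
    host = hostname.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in listed_domains)
```

Matching on `"." + domain` rather than a bare suffix keeps a name like `notyahoo.com` from matching `yahoo.com`.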

We are always looking to expand our white and yellow lists, so if you send email and your server sends only good email and you want to be on our lists, email me at marc@perkel.com with your host name information.

The Magic is in the White Lists

Think differently. It's not just about blocking spam - it's about accepting good email. The real power in this system is the white and yellow lists, not the black list. Envision this. A bank that sends nothing but good email is communicating with tens of thousands of customers on a regular basis. Its email goes to thousands of servers that host the customers' email. So let's say that 30 of these servers are feeding data to the database. After a few months the IP address of the bank's server has 100,000 good emails recorded and say 20 spams (some people will accidentally report good email as spam). Thus the bank can be whitelisted. Why bother to check email from a host like that for spam?

And it's not just banks. It's all institutions that send only good email. No one has to pay a fee to get listed. It's a karma system. Your good reputation gives you a fast pass through the filter.

Some servers send a mixture of spam and nonspam. Examples are AOL, Yahoo, Hotmail, and Comcast: ISPs and others who sell email services. They try to get rid of spam, but some people exploit them anyway. These are the servers that make the yellow list. Their messages still need to be spam tested, but because they have a reputation for sending some good email they can at least bypass blacklisting. Thus, if a Comcast customer starts spamming through Comcast servers and Comcast doesn't detect it, this system will at least keep the Comcast server from being blacklisted, so that other Comcast customers do not have their email blocked.

Can Spammers Out Smart This System?

The short answer is yes - probably some can. However, it represents yet another significant hurdle for them to cross. In reality this system will block mostly easy-to-detect spam sources. But that's not where the power lies. It doesn't matter if spammers outsmart this system. What this system does is protect good email from being falsely identified as spam and blocked. This isn't a spam filter as much as a ham filter. The power is in identifying good email.

To block spam you would just use this as a frontend to your system to preclassify the easy spam/ham and then pass the rest on to meaner tests. A spammer might be able to fake their way from being blacklisted to yellowlisted. But not all the way to whitelisted.
