pax_global_header00006660000000000000000000000064146537363500014526gustar00rootroot0000000000000052 comment=110e93cc6789d44d27e906ef0b97b47b0b79678b nbkenichi-bsfilter-f0a5a7c/000077500000000000000000000000001465373635000157605ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/LICENSE000066400000000000000000000432541465373635000167750ustar00rootroot00000000000000 GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. 
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". 
Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. 
c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. 
You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. 
If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. 
If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. 
The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. 
If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. 
nbkenichi-bsfilter-f0a5a7c/README.md000066400000000000000000000002671465373635000172440ustar00rootroot00000000000000# bsfilter / bayesian spam filter [日本語ドキュメント](https://nbkenichi.github.io/bsfilter/index.html) [English Document](https://nbkenichi.github.io/bsfilter/index-e.html) nbkenichi-bsfilter-f0a5a7c/docs/000077500000000000000000000000001465373635000167105ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/docs/bsfilter.css000066400000000000000000000010341465373635000212320ustar00rootroot00000000000000/* $Id: bsfilter.css,v 1.1 2004/03/07 13:00:37 nabeken Exp $ */ h1 {text-align: center} h2 {text-align: left} body {color: black; background-color: white} .fig {text-align: center} .icon {text-align: right} .version {text-align: right} .section {margin-bottom: 1em; margin-left: 2em} blockquote {color: green} cite {color: green} .blockcite {text-align: right} q {color: green} .quote {color: green} a img {border: 0; color: white} strong {color: red; font-size: 100%} a:link {color: blue} a:visited {color: navy} a:active {color: red} nbkenichi-bsfilter-f0a5a7c/docs/bsfilter.obj000066400000000000000000002134151465373635000212240ustar00rootroot00000000000000%TGIF 4.1.43-QPL state(0,37,100.000,123,128,0,8,1,9,1,1,0,1,2,0,1,0,'Ryumin-Light-EUC-H',0,80640,0,1,0,10,0,0,1,1,0,16,0,0,3,8,1,1,1088,1408,1,0,2880,0). % % @(#)$Header$ % %W% % unit("1 pixel/pixel"). color_info(11,65535,0,[ "magenta", 65535, 0, 65535, 65535, 0, 65535, 1, "red", 65535, 0, 0, 65535, 0, 0, 1, "green", 0, 65535, 0, 0, 65535, 0, 1, "blue", 0, 0, 65535, 0, 0, 65535, 1, "yellow", 65535, 65535, 0, 65535, 65535, 0, 1, "pink", 65535, 49344, 52171, 65535, 49344, 52171, 1, "cyan", 0, 65535, 65535, 0, 65535, 65535, 1, "CadetBlue", 24415, 40606, 41120, 24415, 40606, 41120, 1, "white", 65535, 65535, 65535, 65535, 65535, 65535, 1, "black", 0, 0, 0, 0, 0, 0, 1, "DarkSlateGray", 12079, 20303, 20303, 12079, 20303, 20303, 1 ]). script_frac("0.6"). fg_bg_colors('black','white'). 
dont_reencode("FFDingbests:ZapfDingbats"). page(1,"",1,''). icon([ group([ polygon('black','',5,[ 128,144,128,176,192,176,192,144,128,144],0,1,1,0,2427,0,0,0,0,0,'1',0, "00",[ ]), box('black','',132,148,188,172,0,1,0,2428,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',160,147,1,1,1,224,15,2429,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,160,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "clean", 1, 0, 0, text('black',159,151,1,1,1,31,17,2430,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,165,'',[ minilines(31,17,0,0,1,0,0,[ mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "clean")]) ]) ])])) ]) ], 2426,0,0,[ ]), poly('black','',3,[ 192,144,160,152,128,144],0,1,1,2431,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2425,0,0,0,0,[ ]). 
icon([ group([ box('black','',320,330,384,374,2,1,0,2434,0,0,0,0,0,'1',0,[ ]), oval('black','',320,320,384,340,2,1,1,2435,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,320,364,352,374,320,374,384,374,0,64,20,11520,11520,2436,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 320,330,320,374],0,1,1,2437,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 384,330,384,374],0,1,1,2438,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',324,345,380,380,0,1,0,2439,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',352,344,1,1,1,224,15,2440,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,357,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam", 1, 0, 0, text('black',351,354,1,1,1,31,17,2441,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,368,'',[ minilines(31,17,0,0,1,0,0,[ mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "spam")]) ]) ])])) ]) ], 2433,0,0,[ ]), group([ polygon('black','',16,[ 320,272,320,288,320,304,336,304,344,304,344,320,360,304,432,304, 448,304,448,288,448,272,448,256,432,256,336,256,320,256,320,272],0,1,1,0,2443,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',324,260,444,300,0,1,0,2444,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',384,259,1,1,1,224,15,2445,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,272,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 5", 1, 0, 0, text('black',353,263,2,0,1,61,34,2446,14,3,0,0,0,0,2,61,34,0,0,"",0,0,0,0,277,'',[ minilines(61,34,0,0,0,0,0,[ mini_line(61,14,3,0,0,0,[ str_block(0,61,14,3,0,-1,0,0,0,[ 
str_seg('black','Times-Roman',0,80640,61,14,3,0,-1,0,0,0,0,0, "software 5")]) ]), mini_line(56,14,3,0,0,0,[ str_block(0,56,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,56,14,3,0,-1,0,0,0,0,0, "money 99")]) ]) ])])) ]) ], 2442,0,0,[ ]) ], "database",2432,0,0,0,0,[ ]). icon([ group([ polygon('black','',5,[ 128,336,128,368,192,368,192,336,128,336],0,1,1,0,2449,0,0,0,0,0,'1',0, "00",[ ]), box('black','',132,340,188,364,0,1,0,2450,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',160,339,1,1,1,224,15,2451,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,352,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam", 1, 0, 0, text('black',159,343,1,1,1,31,17,2452,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,357,'',[ minilines(31,17,0,0,1,0,0,[ mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "spam")]) ]) ])])) ]) ], 2448,0,0,[ ]), poly('black','',3,[ 192,336,160,344,128,336],0,1,1,2453,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2447,0,0,0,0,[ ]). 
icon([ group([ box('black','',320,138,384,182,2,1,0,2456,0,0,0,0,0,'1',0,[ ]), oval('black','',320,128,384,148,2,1,1,2457,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,320,172,352,182,320,182,384,182,0,64,20,11520,11520,2458,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 320,138,320,182],0,1,1,2459,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 384,138,384,182],0,1,1,2460,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',324,153,380,188,0,1,0,2461,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',352,152,1,1,1,224,15,2462,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,165,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "clean", 1, 0, 0, text('black',351,162,1,1,1,31,17,2463,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,176,'',[ minilines(31,17,0,0,1,0,0,[ mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "clean")]) ]) ])])) ]) ], 2455,0,0,[ ]), group([ polygon('black','',16,[ 320,80,320,96,320,112,336,112,344,112,344,128,360,112,432,112, 448,112,448,96,448,80,448,64,432,64,336,64,320,64,320,80],0,1,1,0,2465,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',324,68,444,108,0,1,0,2466,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',384,67,1,1,1,224,15,2467,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,80,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 20", 1, 0, 0, text('black',350,71,2,0,1,68,34,2468,14,3,0,0,0,0,2,68,34,0,0,"",0,0,0,0,85,'',[ minilines(68,34,0,0,0,0,0,[ mini_line(68,14,3,0,0,0,[ str_block(0,68,14,3,0,-1,0,0,0,[ 
str_seg('black','Times-Roman',0,80640,68,14,3,0,-1,0,0,0,0,0, "software 20")]) ]), mini_line(49,14,3,0,0,0,[ str_block(0,49,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,49,14,3,0,-1,0,0,0,0,0, "money 1")]) ]) ])])) ]) ], 2464,0,0,[ ]) ], "database",2454,0,0,0,0,[ ]). icon([ group([ box('black','',512,330,576,374,2,1,0,2498,0,0,0,0,0,'1',0,[ ]), oval('black','',512,320,576,340,2,1,1,2497,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,512,364,544,374,512,374,576,374,0,64,20,11520,11520,2496,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 512,330,512,374],0,1,1,2495,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 576,330,576,374],0,1,1,2494,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',516,345,572,380,0,1,0,2491,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',544,344,1,1,1,224,15,2493,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,357,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "probability", 1, 0, 0, text('black',544,354,1,1,1,60,17,2492,14,3,0,0,0,0,2,60,17,0,0,"",0,0,0,0,368,'',[ minilines(60,17,0,0,1,0,0,[ mini_line(60,14,3,0,0,0,[ str_block(0,60,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,60,14,3,0,0,0,0,0,0,0, "probability")]) ]) ])])) ]) ], 2490,0,0,[ ]), group([ polygon('black','',16,[ 512,272,512,288,512,304,528,304,536,304,536,320,552,304,624,304, 640,304,640,288,640,272,640,256,624,256,528,256,512,256,512,272],0,1,1,0,2489,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',516,260,636,300,0,1,0,2486,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',576,259,1,1,1,224,15,2488,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,272,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ 
str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 20%", 1, 0, 0, text('black',536,263,2,0,1,80,34,2487,14,3,0,0,0,0,2,80,34,0,0,"",0,0,0,0,277,'',[ minilines(80,34,0,0,0,0,0,[ mini_line(80,14,3,0,0,0,[ str_block(0,80,14,3,0,-2,0,0,0,[ str_seg('black','Times-Roman',0,80640,80,14,3,0,-2,0,0,0,0,0, "software 20%")]) ]), mini_line(68,14,3,0,0,0,[ str_block(0,68,14,3,0,-2,0,0,0,[ str_seg('black','Times-Roman',0,80640,68,14,3,0,-2,0,0,0,0,0, "money 99%")]) ]) ])])) ]) ], 2485,0,0,[ ]) ], "database",2484,0,0,0,0,[ ]). icon([ group([ polygon('black','',5,[ 512,464,512,496,576,496,576,464,512,464],0,1,1,0,2519,0,0,0,0,0,'1',0, "00",[ ]), box('black','',516,468,572,492,0,1,0,2516,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',544,467,1,1,1,224,15,2518,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,480,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',544,471,1,1,1,52,17,2517,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,485,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 2515,0,0,[ ]), poly('black','',3,[ 576,464,544,472,512,464],0,1,1,2514,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2513,0,0,0,0,[ ]). 
group([ polygon('black','',33,[ 704,416,692,420,702,428,689,428,695,439,684,433,684,446,676,436, 672,448,668,436,660,446,660,433,649,439,655,428,642,428,652,420, 640,416,652,412,642,404,655,404,649,393,660,399,660,386,668,396, 672,384,676,396,684,386,684,399,695,393,689,404,702,404,692,412, 704,416],0,1,1,0,2524,0,0,0,0,0,'1',0, "000000000",[ ]), box('black','',644,388,700,444,0,1,0,2521,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',672,387,1,1,1,224,15,2523,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,400,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam?", 1, 0, 0, text('black',671,407,1,1,1,37,17,2522,14,3,0,0,0,0,2,37,17,0,0,"",0,0,0,0,421,'',[ minilines(37,17,0,0,1,0,0,[ mini_line(37,14,3,0,0,0,[ str_block(0,37,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,37,14,3,0,-1,0,0,0,0,0, "spam?")]) ]) ])])) ]) ], 2520,0,0,[ ]). poly('black','',2,[ 224,352,304,352],1,1,1,2525,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 592,368,640,384],1,1,1,2526,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 592,464,640,448],1,1,1,2528,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 224,160,304,160],1,1,1,2531,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 448,160,496,320],1,1,1,2533,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). text('black',224,178,1,0,1,67,17,2548,14,3,0,0,0,0,2,67,17,0,0,"",0,0,0,0,192,'',[ minilines(67,17,0,0,0,0,0,[ mini_line(67,14,3,0,0,0,[ str_block(0,67,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,67,14,3,0,0,0,0,0,0,0, "--add-clean")]) ]) ])]). 
text('black',224,370,1,0,1,67,17,2550,14,3,0,0,0,0,2,67,17,0,0,"",0,0,0,0,384,'',[ minilines(67,17,0,0,0,0,0,[ mini_line(67,14,3,0,0,0,[ str_block(0,67,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,67,14,3,0,0,0,0,0,0,0, "--add-spam")]) ]) ])]). text('black',416,370,1,0,1,49,17,2557,14,3,0,0,0,0,2,49,17,0,0,"",0,0,0,0,384,'',[ minilines(49,17,0,0,0,0,0,[ mini_line(49,14,3,0,0,0,[ str_block(0,49,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,49,14,3,0,0,0,0,0,0,0, "--update")]) ]) ])]). group([ polygon('black','',13,[ 320,464,320,496,320,512,336,512,432,512,448,512,448,496,448,464, 448,448,432,448,336,448,320,448,320,464],0,1,1,0,2561,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',324,452,444,508,0,1,0,2562,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',384,451,1,1,1,224,15,2563,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "POP server", 1, 0, 0, text('black',384,471,1,1,1,66,17,2564,14,3,0,0,0,0,2,66,17,0,0,"",0,0,0,0,485,'',[ minilines(66,17,0,0,1,0,0,[ mini_line(66,14,3,0,0,0,[ str_block(0,66,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,66,14,3,0,0,0,0,0,0,0, "POP server")]) ]) ])])) ]) ], 2565,0,0,[ ]). 
icon([ group([ polygon('black','',5,[ 768,464,768,496,832,496,832,464,768,464],0,1,1,0,2582,0,0,0,0,0,'1',0, "00",[ ]), box('black','',772,468,828,492,0,1,0,2583,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',800,467,1,1,1,224,15,2584,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,480,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',800,471,1,1,1,52,17,2585,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,485,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 2581,0,0,[ ]), poly('black','',3,[ 832,464,800,472,768,464],0,1,1,2586,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2580,0,0,0,0,[ ]). group([ polygon('black','',16,[ 832,544,832,528,832,512,816,512,808,512,808,496,792,512,720,512, 704,512,704,528,704,544,704,560,720,560,816,560,832,560,832,544],0,1,1,0,2587,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',708,516,828,556,0,1,0,2588,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',768,515,1,1,1,224,15,2589,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,528,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "X-Spam-Flag: Yes/No", 1, 0, 0, text('black',768,527,1,1,1,129,17,2590,14,3,0,0,0,0,2,129,17,0,0,"",0,0,0,0,541,'',[ minilines(129,17,0,0,1,0,0,[ mini_line(129,14,3,0,0,0,[ str_block(0,129,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,129,14,3,0,-1,0,0,0,0,0, "X-Spam-Flag: Yes/No")]) ]) ])])) ]) ], 2591,0,0,[ ]). 
poly('black','',2,[ 704,448,752,464],1,1,1,2602,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). group([ polygon('black','',13,[ 896,464,896,496,896,512,912,512,1008,512,1024,512,1024,496,1024,464, 1024,448,1008,448,912,448,896,448,896,464],0,1,1,0,2604,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',900,452,1020,508,0,1,0,2605,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',960,451,1,1,1,224,15,2606,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "MUA", 1, 0, 0, text('black',960,471,1,1,1,34,17,2607,14,3,0,0,0,0,2,34,17,0,0,"",0,0,0,0,485,'',[ minilines(34,17,0,0,1,0,0,[ mini_line(34,14,3,0,0,0,[ str_block(0,34,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,34,14,3,0,-1,0,0,0,0,0, "MUA")]) ]) ])])) ]) ], 2603,0,0,[ ]). poly('black','',2,[ 416,352,496,352],1,1,1,2611,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 464,480,496,480],1,1,1,2614,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 848,480,880,480],1,1,1,2624,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
group([ polygon('black','',13,[ 256,624,256,720,256,768,280,768,424,768,448,768,448,720,448,624, 448,576,424,576,280,576,256,576,256,624],0,1,1,0,2626,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',262,588,442,756,0,1,0,2627,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',352,585,1,1,1,224,15,2628,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,598,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "IMAP server", 1, 0, 0, text('black',351,663,1,1,1,77,17,2629,14,3,0,0,0,0,2,77,17,0,0,"",0,0,0,0,677,'',[ minilines(77,17,0,0,1,0,0,[ mini_line(77,14,3,0,0,0,[ str_block(0,77,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,77,14,3,0,0,0,0,0,0,0, "IMAP server")]) ]) ])])) ]) ], 2625,0,0,[ ]). icon([ group([ polygon('black','',5,[ 512,608,512,640,576,640,576,608,512,608],0,1,1,0,2639,0,0,0,0,0,'1',0, "00",[ ]), box('black','',516,612,572,636,0,1,0,2640,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',544,611,1,1,1,224,15,2641,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,624,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',544,615,1,1,1,52,17,2642,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,629,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 2638,0,0,[ ]), poly('black','',3,[ 576,608,544,616,512,608],0,1,1,2643,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2637,0,0,0,0,[ ]). poly('black','',2,[ 432,624,496,624],1,1,1,2644,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
group([ polygon('black','',16,[ 576,784,576,768,576,752,560,752,552,752,552,736,536,752,464,752, 448,752,448,768,448,784,448,800,464,800,560,800,576,800,576,784],0,1,1,0,2646,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',452,756,572,796,0,1,0,2647,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',512,755,1,1,1,224,15,2648,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,768,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "X-Spam-Flag: Yes/No", 1, 0, 0, text('black',512,767,1,1,1,129,17,2649,14,3,0,0,0,0,2,129,17,0,0,"",0,0,0,0,781,'',[ minilines(129,17,0,0,1,0,0,[ mini_line(129,14,3,0,0,0,[ str_block(0,129,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,129,14,3,0,-1,0,0,0,0,0, "X-Spam-Flag: Yes/No")]) ]) ])])) ]) ], 2645,0,0,[ ]). icon([ group([ polygon('black','',5,[ 512,704,512,736,576,736,576,704,512,704],0,1,1,0,2652,0,0,0,0,0,'1',0, "00",[ ]), box('black','',516,708,572,732,0,1,0,2653,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',544,707,1,1,1,224,15,2654,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,720,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',544,711,1,1,1,52,17,2655,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,725,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 2651,0,0,[ ]), poly('black','',3,[ 576,704,544,712,512,704],0,1,1,2656,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2650,0,0,0,0,[ ]). 
poly('black','',2,[ 432,720,496,720],2,1,1,2659,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). group([ polygon('black','',33,[ 704,672,692,676,702,684,689,684,695,695,684,689,684,702,676,692, 672,704,668,692,660,702,660,689,649,695,655,684,642,684,652,676, 640,672,652,668,642,660,655,660,649,649,660,655,660,642,668,652, 672,640,676,652,684,642,684,655,695,649,689,660,702,660,692,668, 704,672],0,1,1,0,2662,0,0,0,0,0,'1',0, "000000000",[ ]), box('black','',644,644,700,700,0,1,0,2663,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',672,643,1,1,1,224,15,2664,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,656,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam?", 1, 0, 0, text('black',671,663,1,1,1,37,17,2665,14,3,0,0,0,0,2,37,17,0,0,"",0,0,0,0,677,'',[ minilines(37,17,0,0,1,0,0,[ mini_line(37,14,3,0,0,0,[ str_block(0,37,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,37,14,3,0,-1,0,0,0,0,0, "spam?")]) ]) ])])) ]) ], 2661,0,0,[ ]). poly('black','',2,[ 592,624,640,640],1,1,1,2666,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 592,720,640,704],2,1,1,2670,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
group([ polygon('black','',5,[ 336,688,336,752,416,752,416,688,336,688],0,1,1,0,2729,0,0,0,0,0,'1',0, "00",[ ]), box('black','',341,692,411,748,0,1,0,2730,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',376,691,1,1,1,224,15,2731,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,704,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "INBOX.spam", 1, 0, 0, text('black',376,711,1,1,1,79,17,2732,14,3,0,0,0,0,2,79,17,0,0,"",0,0,0,0,725,'',[ minilines(79,17,0,0,1,0,0,[ mini_line(79,14,3,0,0,0,[ str_block(0,79,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,79,14,3,0,0,0,0,0,0,0, "INBOX.spam")]) ]) ])])) ]) ], 2733,0,0,[ ]). group([ polygon('black','',5,[ 336,592,336,656,416,656,416,592,336,592],0,1,1,0,2761,0,0,0,0,0,'1',0, "00",[ ]), box('black','',341,596,411,652,0,1,0,2762,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',376,595,1,1,1,224,15,2763,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,608,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "INBOX", 1, 0, 0, text('black',375,615,1,1,1,45,17,2764,14,3,0,0,0,0,2,45,17,0,0,"",0,0,0,0,629,'',[ minilines(45,17,0,0,1,0,0,[ mini_line(45,14,3,0,0,0,[ str_block(0,45,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,45,14,3,0,0,0,0,0,0,0, "INBOX")]) ]) ])])) ]) ], 2760,0,0,[ ]). page(2,"",1,''). 
group([ polygon('black','',13,[ 256,624,256,720,256,768,280,768,424,768,448,768,448,720,448,624, 448,576,424,576,280,576,256,576,256,624],0,1,1,0,3835,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',262,588,442,756,0,1,0,3836,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',352,585,1,1,1,224,15,3837,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,598,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "IMAP server", 1, 0, 0, text('black',351,663,1,1,1,77,17,3838,14,3,0,0,0,0,2,77,17,0,0,"",0,0,0,0,677,'',[ minilines(77,17,0,0,1,0,0,[ mini_line(77,14,3,0,0,0,[ str_block(0,77,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,77,14,3,0,0,0,0,0,0,0, "IMAP server")]) ]) ])])) ]) ], 3834,0,0,[ ]). icon([ group([ polygon('black','',5,[ 512,608,512,640,576,640,576,608,512,608],0,1,1,0,3841,0,0,0,0,0,'1',0, "00",[ ]), box('black','',516,612,572,636,0,1,0,3842,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',544,611,1,1,1,224,15,3843,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,624,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',544,615,1,1,1,52,17,3844,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,629,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 3840,0,0,[ ]), poly('black','',3,[ 576,608,544,616,512,608],0,1,1,3845,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",3839,0,0,0,0,[ ]). poly('black','',2,[ 432,624,496,624],1,1,1,3846,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
group([ polygon('black','',16,[ 576,784,576,768,576,752,560,752,552,752,552,736,536,752,464,752, 448,752,448,768,448,784,448,800,464,800,560,800,576,800,576,784],0,1,1,0,3848,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',452,756,572,796,0,1,0,3849,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',512,755,1,1,1,224,15,3850,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,768,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "X-Spam-Flag: Yes/No", 1, 0, 0, text('black',512,767,1,1,1,129,17,3851,14,3,0,0,0,0,2,129,17,0,0,"",0,0,0,0,781,'',[ minilines(129,17,0,0,1,0,0,[ mini_line(129,14,3,0,0,0,[ str_block(0,129,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,129,14,3,0,-1,0,0,0,0,0, "X-Spam-Flag: Yes/No")]) ]) ])])) ]) ], 3847,0,0,[ ]). icon([ group([ polygon('black','',5,[ 512,704,512,736,576,736,576,704,512,704],0,1,1,0,3854,0,0,0,0,0,'1',0, "00",[ ]), box('black','',516,708,572,732,0,1,0,3855,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',544,707,1,1,1,224,15,3856,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,720,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',544,711,1,1,1,52,17,3857,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,725,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 3853,0,0,[ ]), poly('black','',3,[ 576,704,544,712,512,704],0,1,1,3858,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",3852,0,0,0,0,[ ]). 
poly('black','',2,[ 432,720,496,720],2,1,1,3859,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). group([ polygon('black','',33,[ 704,672,692,676,702,684,689,684,695,695,684,689,684,702,676,692, 672,704,668,692,660,702,660,689,649,695,655,684,642,684,652,676, 640,672,652,668,642,660,655,660,649,649,660,655,660,642,668,652, 672,640,676,652,684,642,684,655,695,649,689,660,702,660,692,668, 704,672],0,1,1,0,3861,0,0,0,0,0,'1',0, "000000000",[ ]), box('black','',644,644,700,700,0,1,0,3862,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',672,643,1,1,1,224,15,3863,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,656,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam?", 1, 0, 0, text('black',671,663,1,1,1,37,17,3864,14,3,0,0,0,0,2,37,17,0,0,"",0,0,0,0,677,'',[ minilines(37,17,0,0,1,0,0,[ mini_line(37,14,3,0,0,0,[ str_block(0,37,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,37,14,3,0,-1,0,0,0,0,0, "spam?")]) ]) ])])) ]) ], 3860,0,0,[ ]). poly('black','',2,[ 592,624,640,640],1,1,1,3865,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 592,720,640,704],2,1,1,3866,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
group([ polygon('black','',5,[ 336,688,336,752,416,752,416,688,336,688],0,1,1,0,3868,0,0,0,0,0,'1',0, "00",[ ]), box('black','',341,692,411,748,0,1,0,3869,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',376,691,1,1,1,224,15,3870,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,704,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "INBOX.spam", 1, 0, 0, text('black',376,711,1,1,1,79,17,3871,14,3,0,0,0,0,2,79,17,0,0,"",0,0,0,0,725,'',[ minilines(79,17,0,0,1,0,0,[ mini_line(79,14,3,0,0,0,[ str_block(0,79,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,79,14,3,0,0,0,0,0,0,0, "INBOX.spam")]) ]) ])])) ]) ], 3867,0,0,[ ]). group([ polygon('black','',5,[ 336,592,336,656,416,656,416,592,336,592],0,1,1,0,3873,0,0,0,0,0,'1',0, "00",[ ]), box('black','',341,596,411,652,0,1,0,3874,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',376,595,1,1,1,224,15,3875,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,608,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "INBOX", 1, 0, 0, text('black',375,615,1,1,1,45,17,3876,14,3,0,0,0,0,2,45,17,0,0,"",0,0,0,0,629,'',[ minilines(45,17,0,0,1,0,0,[ mini_line(45,14,3,0,0,0,[ str_block(0,45,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,45,14,3,0,0,0,0,0,0,0, "INBOX")]) ]) ])])) ]) ], 3872,0,0,[ ]). page(3,"",1,''). 
icon([ group([ box('black','',552,586,616,630,2,1,0,3451,0,0,0,0,0,'1',0,[ ]), oval('black','',552,576,616,596,2,1,1,3450,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,552,620,584,630,552,630,616,630,0,64,20,11520,11520,3449,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 552,586,552,630],0,1,1,3448,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 616,586,616,630],0,1,1,3447,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',556,601,612,636,0,1,0,3444,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',584,600,1,1,1,224,15,3446,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,613,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "probability", 1, 0, 0, text('black',584,610,1,1,1,60,17,3445,14,3,0,0,0,0,2,60,17,0,0,"",0,0,0,0,624,'',[ minilines(60,17,0,0,1,0,0,[ mini_line(60,14,3,0,0,0,[ str_block(0,60,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,60,14,3,0,0,0,0,0,0,0, "probability")]) ]) ])])) ]) ], 3443,0,0,[ ]), group([ polygon('black','',16,[ 552,528,552,544,552,560,568,560,576,560,576,576,592,560,664,560, 680,560,680,544,680,528,680,512,664,512,568,512,552,512,552,528],0,1,1,0,3442,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',556,516,676,556,0,1,0,3439,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',616,515,1,1,1,224,15,3441,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,528,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 20%", 1, 0, 0, text('black',576,519,2,0,1,80,34,3440,14,3,0,0,0,0,2,80,34,0,0,"",0,0,0,0,533,'',[ minilines(80,34,0,0,0,0,0,[ mini_line(80,14,3,0,0,0,[ str_block(0,80,14,3,0,-2,0,0,0,[ 
str_seg('black','Times-Roman',0,80640,80,14,3,0,-2,0,0,0,0,0, "software 20%")]) ]), mini_line(68,14,3,0,0,0,[ str_block(0,68,14,3,0,-2,0,0,0,[ str_seg('black','Times-Roman',0,80640,68,14,3,0,-2,0,0,0,0,0, "money 99%")]) ]) ])])) ]) ], 3438,0,0,[ ]) ], "database",3437,0,0,0,0,[ ]). icon([ group([ polygon('black','',5,[ 552,720,552,752,616,752,616,720,552,720],0,1,1,0,3458,0,0,0,0,0,'1',0, "00",[ ]), box('black','',556,724,612,748,0,1,0,3455,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',584,723,1,1,1,224,15,3457,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,736,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',584,727,1,1,1,52,17,3456,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,741,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 3454,0,0,[ ]), poly('black','',3,[ 616,720,584,728,552,720],0,1,1,3453,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",3452,0,0,0,0,[ ]). 
group([ polygon('black','',33,[ 744,672,732,676,742,684,729,684,735,695,724,689,724,702,716,692, 712,704,708,692,700,702,700,689,689,695,695,684,682,684,692,676, 680,672,692,668,682,660,695,660,689,649,700,655,700,642,708,652, 712,640,716,652,724,642,724,655,735,649,729,660,742,660,732,668, 744,672],0,1,1,0,3463,0,0,0,0,0,'1',0, "000000000",[ ]), box('black','',684,644,740,700,0,1,0,3460,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',712,643,1,1,1,224,15,3462,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,656,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam?", 1, 0, 0, text('black',711,663,1,1,1,37,17,3461,14,3,0,0,0,0,2,37,17,0,0,"",0,0,0,0,677,'',[ minilines(37,17,0,0,1,0,0,[ mini_line(37,14,3,0,0,0,[ str_block(0,37,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,37,14,3,0,-1,0,0,0,0,0, "spam?")]) ]) ])])) ]) ], 3459,0,0,[ ]). poly('black','',2,[ 632,624,680,640],1,1,1,3465,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 632,720,680,704],1,1,1,3466,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
icon([ group([ polygon('black','',5,[ 808,720,808,752,872,752,872,720,808,720],0,1,1,0,3483,0,0,0,0,0,'1',0, "00",[ ]), box('black','',812,724,868,748,0,1,0,3480,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',840,723,1,1,1,224,15,3482,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,736,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',840,727,1,1,1,52,17,3481,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,741,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 3479,0,0,[ ]), poly('black','',3,[ 872,720,840,728,808,720],0,1,1,3478,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",3477,0,0,0,0,[ ]). group([ polygon('black','',16,[ 872,800,872,784,872,768,856,768,848,768,848,752,832,768,760,768, 744,768,744,784,744,800,744,816,760,816,856,816,872,816,872,800],0,1,1,0,3488,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',748,772,868,812,0,1,0,3485,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',808,771,1,1,1,224,15,3487,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,784,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "X-Spam-Flag: Yes/No", 1, 0, 0, text('black',808,783,1,1,1,129,17,3486,14,3,0,0,0,0,2,129,17,0,0,"",0,0,0,0,797,'',[ minilines(129,17,0,0,1,0,0,[ mini_line(129,14,3,0,0,0,[ str_block(0,129,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,129,14,3,0,-1,0,0,0,0,0, "X-Spam-Flag: Yes/No")]) ]) ])])) ]) ], 3484,0,0,[ ]). 
poly('black','',2,[ 744,704,792,720],1,1,1,3489,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). group([ polygon('black','',13,[ 320,336,320,368,320,384,336,384,432,384,448,384,448,368,448,336, 448,320,432,320,336,320,320,320,320,336],0,1,1,0,4975,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',324,324,444,380,0,1,0,4972,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',384,323,1,1,1,224,15,4974,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,336,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "POP server", 1, 0, 0, text('black',384,343,1,1,1,66,17,4973,14,3,0,0,0,0,2,66,17,0,0,"",0,0,0,0,357,'',[ minilines(66,17,0,0,1,0,0,[ mini_line(66,14,3,0,0,0,[ str_block(0,66,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,66,14,3,0,0,0,0,0,0,0, "POP server")]) ]) ])])) ]) ], 4971,0,0,[ ]). group([ polygon('black','',13,[ 896,336,896,368,896,384,912,384,1008,384,1024,384,1024,368,1024,336, 1024,320,1008,320,912,320,896,320,896,336],0,1,1,0,4980,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',900,324,1020,380,0,1,0,4977,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',960,323,1,1,1,224,15,4979,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,336,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "MUA", 1, 0, 0, text('black',960,343,1,1,1,34,17,4978,14,3,0,0,0,0,2,34,17,0,0,"",0,0,0,0,357,'',[ minilines(34,17,0,0,1,0,0,[ mini_line(34,14,3,0,0,0,[ str_block(0,34,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,34,14,3,0,-1,0,0,0,0,0, "MUA")]) ]) ])])) ]) ], 4976,0,0,[ ]). poly('black','',2,[ 776,352,888,352],1,1,1,4981,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
text('black',464,266,1,0,1,126,17,4982,14,3,0,0,0,0,2,126,17,0,0,"",0,0,0,0,280,'',[ minilines(126,17,0,0,0,0,0,[ mini_line(126,14,3,0,0,0,[ str_block(0,126,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,126,14,3,0,-1,0,0,0,0,0, "pop.example.com:110")]) ]) ])]). oval('black','',528,288,528,288,1,1,1,4983,0,0,0,0,0,'1',0,[ ]). polygon('black','',16,[ 448,262,448,285,448,290,464,290,472,290,448,352,488,290,592,290, 608,290,608,285,608,262,608,256,592,256,464,256,448,256,448,262],0,1,1,0,4984,0,0,0,0,0,'1',0, "2092",[ ]). box('black','',460,257,612,289,0,1,0,4985,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',536,257,1,1,1,224,15,4987,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,270,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "", 1, 0, 0, text('black',536,265,1,1,1,0,15,4986,13,2,0,0,0,0,2,0,15,0,0,"",0,0,0,0,278,'',[ minilines(0,15,0,0,1,0,0,[ mini_line(0,13,2,0,0,0,[ str_block(0,0,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,0,13,2,0,0,0,1,1,0,0, "")]) ]) ])])) ]). text('black',784,266,1,0,1,90,17,4988,14,3,0,0,0,0,2,90,17,0,0,"",0,0,0,0,280,'',[ minilines(90,17,0,0,0,0,0,[ mini_line(90,14,3,0,0,0,[ str_block(0,90,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,90,14,3,0,-1,0,0,0,0,0, "localhost:10110")]) ]) ])]). oval('black','',848,288,848,288,1,1,1,4989,0,0,0,0,0,'1',0,[ ]). polygon('black','',16,[ 768,262,768,285,768,290,784,290,792,290,768,352,808,290,912,290, 928,290,928,285,928,262,928,256,912,256,784,256,768,256,768,262],0,1,1,0,4990,0,0,0,0,0,'1',0, "2092",[ ]). 
box('black','',780,257,932,289,0,1,0,4991,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',856,257,1,1,1,224,15,4993,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,270,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "", 1, 0, 0, text('black',856,265,1,1,1,0,15,4992,13,2,0,0,0,0,2,0,15,0,0,"",0,0,0,0,278,'',[ minilines(0,15,0,0,1,0,0,[ mini_line(0,13,2,0,0,0,[ str_block(0,0,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,0,13,2,0,0,0,1,1,0,0, "")]) ]) ])])) ]). group([ polygon('black','',13,[ 640,336,640,368,640,384,656,384,752,384,768,384,768,368,768,336, 768,320,752,320,656,320,640,320,640,336],0,1,1,0,4998,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',644,324,764,380,0,1,0,4995,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',704,323,1,1,1,224,15,4997,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,336,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "bsfilter", 1, 0, 0, text('black',703,343,1,1,1,39,17,4996,14,3,0,0,0,0,2,39,17,0,0,"",0,0,0,0,357,'',[ minilines(39,17,0,0,1,0,0,[ mini_line(39,14,3,0,0,0,[ str_block(0,39,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,39,14,3,0,0,0,0,0,0,0, "bsfilter")]) ]) ])])) ]) ], 4994,0,0,[ ]). text('black',816,362,1,0,1,26,17,4999,14,3,0,0,0,0,2,26,17,0,0,"",0,0,0,0,376,'',[ minilines(26,17,0,0,0,0,0,[ mini_line(26,14,3,0,0,0,[ str_block(0,26,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,26,14,3,0,-1,0,0,0,0,0, "POP")]) ]) ])]). poly('black','',2,[ 464,352,624,352],1,1,1,5000,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
text('black',528,362,1,0,1,26,17,5001,14,3,0,0,0,0,2,26,17,0,0,"",0,0,0,0,376,'',[ minilines(26,17,0,0,0,0,0,[ mini_line(26,14,3,0,0,0,[ str_block(0,26,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,26,14,3,0,-1,0,0,0,0,0, "POP")]) ]) ])]). poly('black','',4,[ 640,352,728,392,336,640,536,736],1,1,1,5037,1,0,1,0,0,0,0,'1',0,0, "6","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',4,[ 768,352,680,392,1072,640,872,736],2,1,1,5057,1,0,1,0,0,0,0,'1',0,0, "6","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). page(4,"",1,''). group([ polygon('black','',13,[ 192,464,192,496,192,512,208,512,304,512,320,512,320,496,320,464, 320,448,304,448,208,448,192,448,192,464],0,1,1,0,4504,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',196,452,316,508,0,1,0,4501,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',256,451,1,1,1,224,15,4503,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "POP server", 1, 0, 0, text('black',256,471,1,1,1,66,17,4502,14,3,0,0,0,0,2,66,17,0,0,"",0,0,0,0,485,'',[ minilines(66,17,0,0,1,0,0,[ mini_line(66,14,3,0,0,0,[ str_block(0,66,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,66,14,3,0,0,0,0,0,0,0, "POP server")]) ]) ])])) ]) ], 4500,0,0,[ ]). 
group([ polygon('black','',13,[ 768,464,768,496,768,512,784,512,880,512,896,512,896,496,896,464, 896,448,880,448,784,448,768,448,768,464],0,1,1,0,4522,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',772,452,892,508,0,1,0,4519,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',832,451,1,1,1,224,15,4521,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "MUA", 1, 0, 0, text('black',832,471,1,1,1,34,17,4520,14,3,0,0,0,0,2,34,17,0,0,"",0,0,0,0,485,'',[ minilines(34,17,0,0,1,0,0,[ mini_line(34,14,3,0,0,0,[ str_block(0,34,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,34,14,3,0,-1,0,0,0,0,0, "MUA")]) ]) ])])) ]) ], 4518,0,0,[ ]). poly('black','',2,[ 336,480,752,480],1,1,1,4523,0,0,4,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). text('black',480,482,1,0,1,113,17,4618,14,3,0,0,0,0,2,113,17,0,0,"",0,0,0,0,496,'',[ minilines(113,17,0,0,0,0,0,[ mini_line(113,14,3,0,0,0,[ str_block(0,113,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,113,14,3,0,-1,0,0,0,0,0, "POP over SSL/TLS")]) ]) ])]). text('black',336,394,1,0,1,132,17,4626,14,3,0,0,0,0,2,132,17,0,0,"",0,0,0,0,408,'',[ minilines(132,17,0,0,0,0,0,[ mini_line(132,14,3,0,0,0,[ str_block(0,132,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,132,14,3,0,-1,0,0,0,0,0, "pops.example.com:995")]) ]) ])]). oval('black','',400,416,400,416,1,1,1,4630,0,0,0,0,0,'1',0,[ ]). polygon('black','',16,[ 320,390,320,413,320,418,336,418,344,418,320,480,360,418,464,418, 480,418,480,413,480,390,480,384,464,384,336,384,320,384,320,390],0,1,1,0,4662,0,0,0,0,0,'1',0, "2092",[ ]). 
box('black','',332,385,484,417,0,1,0,4663,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',408,385,1,1,1,224,15,4664,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,398,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "", 1, 0, 0, text('black',408,393,1,1,1,0,15,4665,13,2,0,0,0,0,2,0,15,0,0,"",0,0,0,0,406,'',[ minilines(0,15,0,0,1,0,0,[ mini_line(0,13,2,0,0,0,[ str_block(0,0,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,0,13,2,0,0,0,1,1,0,0, "")]) ]) ])])) ]). page(5,"",1,''). group([ polygon('black','',13,[ 192,464,192,496,192,512,208,512,304,512,320,512,320,496,320,464, 320,448,304,448,208,448,192,448,192,464],0,1,1,0,4783,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',196,452,316,508,0,1,0,4784,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',256,451,1,1,1,224,15,4785,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "POP server", 1, 0, 0, text('black',256,471,1,1,1,66,17,4786,14,3,0,0,0,0,2,66,17,0,0,"",0,0,0,0,485,'',[ minilines(66,17,0,0,1,0,0,[ mini_line(66,14,3,0,0,0,[ str_block(0,66,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,66,14,3,0,0,0,0,0,0,0, "POP server")]) ]) ])])) ]) ], 4782,0,0,[ ]). 
group([ polygon('black','',13,[ 768,464,768,496,768,512,784,512,880,512,896,512,896,496,896,464, 896,448,880,448,784,448,768,448,768,464],0,1,1,0,4788,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',772,452,892,508,0,1,0,4789,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',832,451,1,1,1,224,15,4790,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "MUA", 1, 0, 0, text('black',832,471,1,1,1,34,17,4791,14,3,0,0,0,0,2,34,17,0,0,"",0,0,0,0,485,'',[ minilines(34,17,0,0,1,0,0,[ mini_line(34,14,3,0,0,0,[ str_block(0,34,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,34,14,3,0,-1,0,0,0,0,0, "MUA")]) ]) ])])) ]) ], 4787,0,0,[ ]). poly('black','',2,[ 648,480,760,480],1,1,1,4792,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). text('black',352,490,1,0,1,113,17,4793,14,3,0,0,0,0,2,113,17,0,0,"",0,0,0,0,504,'',[ minilines(113,17,0,0,0,0,0,[ mini_line(113,14,3,0,0,0,[ str_block(0,113,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,113,14,3,0,-1,0,0,0,0,0, "POP over SSL/TLS")]) ]) ])]). text('black',336,394,1,0,1,132,17,4794,14,3,0,0,0,0,2,132,17,0,0,"",0,0,0,0,408,'',[ minilines(132,17,0,0,0,0,0,[ mini_line(132,14,3,0,0,0,[ str_block(0,132,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,132,14,3,0,-1,0,0,0,0,0, "pops.example.com:995")]) ]) ])]). oval('black','',400,416,400,416,1,1,1,4795,0,0,0,0,0,'1',0,[ ]). polygon('black','',16,[ 320,390,320,413,320,418,336,418,344,418,320,480,360,418,464,418, 480,418,480,413,480,390,480,384,464,384,336,384,320,384,320,390],0,1,1,0,4796,0,0,0,0,0,'1',0, "2092",[ ]). 
box('black','',332,385,484,417,0,1,0,4797,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',408,385,1,1,1,224,15,4798,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,398,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "", 1, 0, 0, text('black',408,393,1,1,1,0,15,4799,13,2,0,0,0,0,2,0,15,0,0,"",0,0,0,0,406,'',[ minilines(0,15,0,0,1,0,0,[ mini_line(0,13,2,0,0,0,[ str_block(0,0,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,0,13,2,0,0,0,1,1,0,0, "")]) ]) ])])) ]). text('black',656,394,1,0,1,90,17,4800,14,3,0,0,0,0,2,90,17,0,0,"",0,0,0,0,408,'',[ minilines(90,17,0,0,0,0,0,[ mini_line(90,14,3,0,0,0,[ str_block(0,90,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,90,14,3,0,-1,0,0,0,0,0, "localhost:10110")]) ]) ])]). oval('black','',720,416,720,416,1,1,1,4802,0,0,0,0,0,'1',0,[ ]). polygon('black','',16,[ 640,390,640,413,640,418,656,418,664,418,640,480,680,418,784,418, 800,418,800,413,800,390,800,384,784,384,656,384,640,384,640,390],0,1,1,0,4803,0,0,0,0,0,'1',0, "2092",[ ]). box('black','',652,385,804,417,0,1,0,4804,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',728,385,1,1,1,224,15,4805,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,398,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "", 1, 0, 0, text('black',728,393,1,1,1,0,15,4806,13,2,0,0,0,0,2,0,15,0,0,"",0,0,0,0,406,'',[ minilines(0,15,0,0,1,0,0,[ mini_line(0,13,2,0,0,0,[ str_block(0,0,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,0,13,2,0,0,0,1,1,0,0, "")]) ]) ])])) ]). 
group([ polygon('black','',13,[ 512,464,512,496,512,512,528,512,624,512,640,512,640,496,640,464, 640,448,624,448,528,448,512,448,512,464],0,1,1,0,4816,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',516,452,636,508,0,1,0,4813,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',576,451,1,1,1,224,15,4815,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "bsfilter", 1, 0, 0, text('black',575,471,1,1,1,39,17,4814,14,3,0,0,0,0,2,39,17,0,0,"",0,0,0,0,485,'',[ minilines(39,17,0,0,1,0,0,[ mini_line(39,14,3,0,0,0,[ str_block(0,39,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,39,14,3,0,0,0,0,0,0,0, "bsfilter")]) ]) ])])) ]) ], 4812,0,0,[ ]). poly('black','',2,[ 328,480,504,480],1,1,1,4823,0,0,4,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). text('black',688,490,1,0,1,26,17,4833,14,3,0,0,0,0,2,26,17,0,0,"",0,0,0,0,504,'',[ minilines(26,17,0,0,0,0,0,[ mini_line(26,14,3,0,0,0,[ str_block(0,26,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,26,14,3,0,-1,0,0,0,0,0, "POP")]) ]) ])]). page(6,"",1,''). 
group([ polygon('black','',13,[ 192,464,192,496,192,512,208,512,304,512,320,512,320,496,320,464, 320,448,304,448,208,448,192,448,192,464],0,1,1,0,4868,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',196,452,316,508,0,1,0,4869,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',256,451,1,1,1,224,15,4870,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "POP server", 1, 0, 0, text('black',256,471,1,1,1,66,17,4871,14,3,0,0,0,0,2,66,17,0,0,"",0,0,0,0,485,'',[ minilines(66,17,0,0,1,0,0,[ mini_line(66,14,3,0,0,0,[ str_block(0,66,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,66,14,3,0,0,0,0,0,0,0, "POP server")]) ]) ])])) ]) ], 4867,0,0,[ ]). group([ polygon('black','',13,[ 768,464,768,496,768,512,784,512,880,512,896,512,896,496,896,464, 896,448,880,448,784,448,768,448,768,464],0,1,1,0,4873,0,0,0,0,0,'1',0, "2490",[ ]), box('black','',772,452,892,508,0,1,0,4874,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',832,451,1,1,1,224,15,4875,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,464,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "MUA", 1, 0, 0, text('black',832,471,1,1,1,34,17,4876,14,3,0,0,0,0,2,34,17,0,0,"",0,0,0,0,485,'',[ minilines(34,17,0,0,1,0,0,[ mini_line(34,14,3,0,0,0,[ str_block(0,34,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,34,14,3,0,-1,0,0,0,0,0, "MUA")]) ]) ])])) ]) ], 4872,0,0,[ ]). text('black',336,394,1,0,1,126,17,4879,14,3,0,0,0,0,2,126,17,0,0,"",0,0,0,0,408,'',[ minilines(126,17,0,0,0,0,0,[ mini_line(126,14,3,0,0,0,[ str_block(0,126,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,126,14,3,0,-1,0,0,0,0,0, "pop.example.com:110")]) ]) ])]). 
oval('black','',400,416,400,416,1,1,1,4880,0,0,0,0,0,'1',0,[ ]). polygon('black','',16,[ 320,390,320,413,320,418,336,418,344,418,320,480,360,418,464,418, 480,418,480,413,480,390,480,384,464,384,336,384,320,384,320,390],0,1,1,0,4881,0,0,0,0,0,'1',0, "2092",[ ]). box('black','',332,385,484,417,0,1,0,4882,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',408,385,1,1,1,224,15,4883,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,398,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "", 1, 0, 0, text('black',408,393,1,1,1,0,15,4884,13,2,0,0,0,0,2,0,15,0,0,"",0,0,0,0,406,'',[ minilines(0,15,0,0,1,0,0,[ mini_line(0,13,2,0,0,0,[ str_block(0,0,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,0,13,2,0,0,0,1,1,0,0, "")]) ]) ])])) ]). oval('black','',720,416,720,416,1,1,1,4886,0,0,0,0,0,'1',0,[ ]). poly('black','',2,[ 336,480,752,480],1,1,1,4906,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). text('black',552,490,1,0,1,26,17,4908,14,3,0,0,0,0,2,26,17,0,0,"",0,0,0,0,504,'',[ minilines(26,17,0,0,0,0,0,[ mini_line(26,14,3,0,0,0,[ str_block(0,26,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,26,14,3,0,-1,0,0,0,0,0, "POP")]) ]) ])]). page(7,"",1,''). 
icon([ group([ box('black','',320,266,384,310,2,1,0,2983,0,0,0,0,0,'1',0,[ ]), oval('black','',320,256,384,276,2,1,1,2982,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,320,300,352,310,320,310,384,310,0,64,20,11520,11520,2981,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 320,266,320,310],0,1,1,2980,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 384,266,384,310],0,1,1,2979,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',324,281,380,316,0,1,0,2976,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',352,280,1,1,1,224,15,2978,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,293,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "probability", 1, 0, 0, text('black',352,290,1,1,1,60,17,2977,14,3,0,0,0,0,2,60,17,0,0,"",0,0,0,0,304,'',[ minilines(60,17,0,0,1,0,0,[ mini_line(60,14,3,0,0,0,[ str_block(0,60,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,60,14,3,0,0,0,0,0,0,0, "probability")]) ]) ])])) ]) ], 2975,0,0,[ ]), group([ polygon('black','',16,[ 320,208,320,224,320,240,336,240,344,240,344,256,360,240,432,240, 448,240,448,224,448,208,448,192,432,192,336,192,320,192,320,208],0,1,1,0,2974,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',324,196,444,236,0,1,0,2971,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',384,195,1,1,1,224,15,2973,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,208,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 20%", 1, 0, 0, text('black',344,199,2,0,1,80,34,2972,14,3,0,0,0,0,2,80,34,0,0,"",0,0,0,0,213,'',[ minilines(80,34,0,0,0,0,0,[ mini_line(80,14,3,0,0,0,[ str_block(0,80,14,3,0,-2,0,0,0,[ 
str_seg('black','Times-Roman',0,80640,80,14,3,0,-2,0,0,0,0,0, "software 20%")]) ]), mini_line(68,14,3,0,0,0,[ str_block(0,68,14,3,0,-2,0,0,0,[ str_seg('black','Times-Roman',0,80640,68,14,3,0,-2,0,0,0,0,0, "money 99%")]) ]) ])])) ]) ], 2970,0,0,[ ]) ], "database",2969,0,0,0,0,[ ]). icon([ group([ polygon('black','',5,[ 320,400,320,432,384,432,384,400,320,400],0,1,1,0,2990,0,0,0,0,0,'1',0, "00",[ ]), box('black','',324,404,380,428,0,1,0,2987,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',352,403,1,1,1,224,15,2989,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,416,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "new mail", 1, 0, 0, text('black',352,407,1,1,1,52,17,2988,14,3,0,0,0,0,2,52,17,0,0,"",0,0,0,0,421,'',[ minilines(52,17,0,0,1,0,0,[ mini_line(52,14,3,0,0,0,[ str_block(0,52,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,52,14,3,0,0,0,0,0,0,0, "new mail")]) ]) ])])) ]) ], 2986,0,0,[ ]), poly('black','',3,[ 384,400,352,408,320,400],0,1,1,2985,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2984,0,0,0,0,[ ]). 
group([ polygon('black','',33,[ 512,352,500,356,510,364,497,364,503,375,492,369,492,382,484,372, 480,384,476,372,468,382,468,369,457,375,463,364,450,364,460,356, 448,352,460,348,450,340,463,340,457,329,468,335,468,322,476,332, 480,320,484,332,492,322,492,335,503,329,497,340,510,340,500,348, 512,352],0,1,1,0,2995,0,0,0,0,0,'1',0, "000000000",[ ]), box('black','',452,324,508,380,0,1,0,2992,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',480,323,1,1,1,224,15,2994,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,336,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam?", 1, 0, 0, text('black',479,343,1,1,1,37,17,2993,14,3,0,0,0,0,2,37,17,0,0,"",0,0,0,0,357,'',[ minilines(37,17,0,0,1,0,0,[ mini_line(37,14,3,0,0,0,[ str_block(0,37,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,37,14,3,0,-1,0,0,0,0,0, "spam?")]) ]) ])])) ]) ], 2991,0,0,[ ]). poly('black','',2,[ 400,304,448,320],1,1,1,2996,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 400,400,448,384],1,1,1,2997,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). page(8,"",1,''). 
icon([ group([ polygon('black','',5,[ 256,144,256,176,320,176,320,144,256,144],0,1,1,0,2867,0,0,0,0,0,'1',0, "00",[ ]), box('black','',260,148,316,172,0,1,0,2864,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',288,147,1,1,1,224,15,2866,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,160,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "clean", 1, 0, 0, text('black',287,151,1,1,1,31,17,2865,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,165,'',[ minilines(31,17,0,0,1,0,0,[ mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "clean")]) ]) ])])) ]) ], 2863,0,0,[ ]), poly('black','',3,[ 320,144,288,152,256,144],0,1,1,2862,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2861,0,0,0,0,[ ]). icon([ group([ box('black','',448,330,512,374,2,1,0,2882,0,0,0,0,0,'1',0,[ ]), oval('black','',448,320,512,340,2,1,1,2881,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,448,364,480,374,448,374,512,374,0,64,20,11520,11520,2880,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 448,330,448,374],0,1,1,2879,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 512,330,512,374],0,1,1,2878,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',452,345,508,380,0,1,0,2875,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',480,344,1,1,1,224,15,2877,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,357,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam", 1, 0, 0, text('black',479,354,1,1,1,31,17,2876,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,368,'',[ minilines(31,17,0,0,1,0,0,[ 
mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "spam")]) ]) ])])) ]) ], 2874,0,0,[ ]), group([ polygon('black','',16,[ 448,272,448,288,448,304,464,304,472,304,472,320,488,304,560,304, 576,304,576,288,576,272,576,256,560,256,464,256,448,256,448,272],0,1,1,0,2873,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',452,260,572,300,0,1,0,2870,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',512,259,1,1,1,224,15,2872,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,272,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 5", 1, 0, 0, text('black',481,263,2,0,1,61,34,2871,14,3,0,0,0,0,2,61,34,0,0,"",0,0,0,0,277,'',[ minilines(61,34,0,0,0,0,0,[ mini_line(61,14,3,0,0,0,[ str_block(0,61,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,61,14,3,0,-1,0,0,0,0,0, "software 5")]) ]), mini_line(56,14,3,0,0,0,[ str_block(0,56,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,56,14,3,0,-1,0,0,0,0,0, "money 99")]) ]) ])])) ]) ], 2869,0,0,[ ]) ], "database",2868,0,0,0,0,[ ]). 
icon([ group([ polygon('black','',5,[ 256,336,256,368,320,368,320,336,256,336],0,1,1,0,2889,0,0,0,0,0,'1',0, "00",[ ]), box('black','',260,340,316,364,0,1,0,2886,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',288,339,1,1,1,224,15,2888,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,352,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "spam", 1, 0, 0, text('black',287,343,1,1,1,31,17,2887,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,357,'',[ minilines(31,17,0,0,1,0,0,[ mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "spam")]) ]) ])])) ]) ], 2885,0,0,[ ]), poly('black','',3,[ 320,336,288,344,256,336],0,1,1,2884,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]) ], "letter",2883,0,0,0,0,[ ]). icon([ group([ box('black','',448,138,512,182,2,1,0,2904,0,0,0,0,0,'1',0,[ ]), oval('black','',448,128,512,148,2,1,1,2903,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,448,172,480,182,448,182,512,182,0,64,20,11520,11520,2902,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 448,138,448,182],0,1,1,2901,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 512,138,512,182],0,1,1,2900,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',452,153,508,188,0,1,0,2897,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',480,152,1,1,1,224,15,2899,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,165,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "clean", 1, 0, 0, text('black',479,162,1,1,1,31,17,2898,14,3,0,0,0,0,2,31,17,0,0,"",0,0,0,0,176,'',[ minilines(31,17,0,0,1,0,0,[ 
mini_line(31,14,3,0,0,0,[ str_block(0,31,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,31,14,3,0,0,0,0,0,0,0, "clean")]) ]) ])])) ]) ], 2896,0,0,[ ]), group([ polygon('black','',16,[ 448,80,448,96,448,112,464,112,472,112,472,128,488,112,560,112, 576,112,576,96,576,80,576,64,560,64,464,64,448,64,448,80],0,1,1,0,2895,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',452,68,572,108,0,1,0,2892,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',512,67,1,1,1,224,15,2894,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,80,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 20", 1, 0, 0, text('black',478,71,2,0,1,68,34,2893,14,3,0,0,0,0,2,68,34,0,0,"",0,0,0,0,85,'',[ minilines(68,34,0,0,0,0,0,[ mini_line(68,14,3,0,0,0,[ str_block(0,68,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,68,14,3,0,-1,0,0,0,0,0, "software 20")]) ]), mini_line(49,14,3,0,0,0,[ str_block(0,49,14,3,0,-1,0,0,0,[ str_seg('black','Times-Roman',0,80640,49,14,3,0,-1,0,0,0,0,0, "money 1")]) ]) ])])) ]) ], 2891,0,0,[ ]) ], "database",2890,0,0,0,0,[ ]). 
icon([ group([ box('black','',640,330,704,374,2,1,0,2919,0,0,0,0,0,'1',0,[ ]), oval('black','',640,320,704,340,2,1,1,2918,0,0,0,0,0,'1',0,[ ]), arc('black','',2,1,1,0,640,364,672,374,640,374,704,374,0,64,20,11520,11520,2917,0,0,8,3,0,0,0,'1','8','3',0,[ ]), poly('black','',2,[ 640,330,640,374],0,1,1,2916,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), poly('black','',2,[ 704,330,704,374],0,1,1,2915,0,2,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]), box('black','',644,345,700,380,0,1,0,2912,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',672,344,1,1,1,224,15,2914,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,357,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "probability", 1, 0, 0, text('black',672,354,1,1,1,60,17,2913,14,3,0,0,0,0,2,60,17,0,0,"",0,0,0,0,368,'',[ minilines(60,17,0,0,1,0,0,[ mini_line(60,14,3,0,0,0,[ str_block(0,60,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,60,14,3,0,0,0,0,0,0,0, "probability")]) ]) ])])) ]) ], 2911,0,0,[ ]), group([ polygon('black','',16,[ 640,272,640,288,640,304,656,304,664,304,664,320,680,304,752,304, 768,304,768,288,768,272,768,256,752,256,656,256,640,256,640,272],0,1,1,0,2910,0,0,0,0,0,'1',0, "2092",[ ]), box('black','',644,260,764,300,0,1,0,2907,0,0,0,0,0,'1',0,[ attr("", "auto_center_attr", 0, 1, 0, text('black',704,259,1,1,1,224,15,2909,13,2,0,0,0,0,2,224,15,0,0,"",0,0,0,0,272,'',[ minilines(224,15,0,0,1,0,0,[ mini_line(224,13,2,0,0,0,[ str_block(0,224,13,2,0,0,0,0,0,[ str_seg('black','Ryumin-Light-EUC-H',0,80640,224,13,2,0,0,0,1,1,0,0, "auto_center_attr")]) ]) ])])), attr("label=", "software 20%", 1, 0, 0, text('black',664,263,2,0,1,80,34,2908,14,3,0,0,0,0,2,80,34,0,0,"",0,0,0,0,277,'',[ minilines(80,34,0,0,0,0,0,[ mini_line(80,14,3,0,0,0,[ str_block(0,80,14,3,0,-2,0,0,0,[ 
str_seg('black','Times-Roman',0,80640,80,14,3,0,-2,0,0,0,0,0, "software 20%")]) ]), mini_line(68,14,3,0,0,0,[ str_block(0,68,14,3,0,-2,0,0,0,[ str_seg('black','Times-Roman',0,80640,68,14,3,0,-2,0,0,0,0,0, "money 99%")]) ]) ])])) ]) ], 2906,0,0,[ ]) ], "database",2905,0,0,0,0,[ ]). poly('black','',2,[ 352,352,432,352],1,1,1,2920,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 352,160,432,160],1,1,1,2922,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). poly('black','',2,[ 576,160,624,320],1,1,1,2923,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). text('black',352,178,1,0,1,67,17,2924,14,3,0,0,0,0,2,67,17,0,0,"",0,0,0,0,192,'',[ minilines(67,17,0,0,0,0,0,[ mini_line(67,14,3,0,0,0,[ str_block(0,67,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,67,14,3,0,0,0,0,0,0,0, "--add-clean")]) ]) ])]). text('black',352,370,1,0,1,67,17,2925,14,3,0,0,0,0,2,67,17,0,0,"",0,0,0,0,384,'',[ minilines(67,17,0,0,0,0,0,[ mini_line(67,14,3,0,0,0,[ str_block(0,67,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,67,14,3,0,0,0,0,0,0,0, "--add-spam")]) ]) ])]). text('black',544,370,1,0,1,49,17,2926,14,3,0,0,0,0,2,49,17,0,0,"",0,0,0,0,384,'',[ minilines(49,17,0,0,0,0,0,[ mini_line(49,14,3,0,0,0,[ str_block(0,49,14,3,0,0,0,0,0,[ str_seg('black','Times-Roman',0,80640,49,14,3,0,0,0,0,0,0,0, "--update")]) ]) ])]). poly('black','',2,[ 544,352,624,352],1,1,1,2927,0,0,0,0,0,0,0,'1',0,0, "0","",[ 0,8,3,0,'8','3','0'],[0,8,3,0,'8','3','0'],[ ]). 
nbkenichi-bsfilter-f0a5a7c/docs/imap.png (binary PNG image data omitted)
nbkenichi-bsfilter-f0a5a7c/docs/index-e.html
bsfilter / bayesian spam filter

bsfilter / bayesian spam filter

Japanese / English

0. What is bsfilter?

  • a filter that distinguishes spam from non-spam mail (called "clean" on this page)
  • supports mail written in Japanese
  • written in Ruby
  • supports 3 methods of access
    • traditional Unix-style filter: learns from and judges local files or piped input
    • IMAP: learns from and judges mail stored on an IMAP server; IMAP over SSL is supported
    • POP proxy: runs between the POP server and the MUA; POP over SSL is supported
  • basic concepts come from A Plan for Spam, Better Bayesian Filtering, and Spam Detection
  • distributed under the GPL

1. Contents

2. Download/Install

2.1. UNIX

Install the Ruby interpreter. Put bsfilter/bsfilter in a directory on your executable path. On some OSs or distributions, you may be able to use a package such as ports or ebuild.

3. How does bsfilter work?

3.1. Using the spam probability of each token

using the spam probability of each token

3.2. Preparation is needed

need to prepare

4. Let's get started

preparation

The databases must be prepared before filtering.

1. count tokens in clean mails

% bsfilter --add-clean ~/Mail/inbox/*

2. count tokens in spam

% bsfilter --add-spam ~/Mail/spam/*

3. calculate spam probability for each token

% bsfilter --update

filtering

example: specify the files to filter as command line arguments. The spam probability (between 0 and 1) of each file is displayed.

% bsfilter ~/Mail/inbox/1
combined probability /home/nabeken/Mail/inbox/1 1 0.012701

example: feed the mail to filter through stdin. The exit status is 0 if the mail is spam.

~% bsfilter < ~/Mail/inbox/1 ; echo $status
1
~% bsfilter < ~/Mail/spam/1 ; echo $status
0
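The exit-status convention above can also be checked from a script. The following is a minimal sketch, assuming only what the example shows (exit status 0 means spam); the helper name `spam?` and its `command` parameter are hypothetical and not part of bsfilter.

```ruby
require "open3"

# Hypothetical helper: feeds one message to a filter command on stdin and
# reports whether the command exited with status 0, which the examples
# above show bsfilter using for spam.
def spam?(mail_text, command = ["bsfilter"])
  _output, status = Open3.capture2(*command, stdin_data: mail_text)
  status.exited? && status.exitstatus.zero?
end

# Usage (requires bsfilter on the PATH and prepared databases):
#   spam?(File.read(File.expand_path("~/Mail/inbox/1")))
```

Passing the command as a parameter keeps the helper usable with a wrapper script or an absolute path such as the one in the procmail recipes.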

procmail sample recipe 1: move spam to the spam folder using the exit status

:0 HB:
* ? bsfilter -a
spam/.

procmail sample recipe 2: add X-Spam-Flag: and X-Spam-Probability: headers, then move spam to the black or gray folder based on the probability in the X-Spam-Probability: header

:0 fw
| /home/nabeken/bin/bsfilter --pipe --insert-flag --insert-probability

:0
* ^X-Spam-Probability: *(1|0\.[89])
black/.

:0
* ^X-Spam-Probability: *0\.[67]
gray/.
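To see which probabilities land in which folder, the two conditions of the recipe can be replayed outside procmail. This is a rough sketch: the regular expressions are copied from the recipe, while the function name `folder_for` and the fallback name "inbox" (standing for default delivery) are assumptions made for illustration.

```ruby
# Patterns copied from the recipe above. procmail matches case-insensitively
# by default, hence the /i flag.
BLACK_PATTERN = /^X-Spam-Probability: *(1|0\.[89])/i
GRAY_PATTERN  = /^X-Spam-Probability: *0\.[67]/i

# Hypothetical helper: returns the folder the recipe would choose for a
# block of header text; "inbox" stands for "no condition matched".
def folder_for(headers)
  return "black" if headers.match?(BLACK_PATTERN)
  return "gray"  if headers.match?(GRAY_PATTERN)
  "inbox"
end

["0.012701", "0.65", "0.83", "1.000000"].each do |value|
  puts "#{value} -> #{folder_for("X-Spam-Probability: #{value}")}"
end
```

In other words, 0.8 and above goes to black, 0.6 up to 0.8 goes to gray, and everything else is delivered normally.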

5. Help

issues

6. Usage

Command line formats

There are 2 formats.

  1. bsfilter [options] [commands] < MAIL
  2. bsfilter [options] [commands] MAIL ...

There are two modes: maintenance mode and filtering mode.

  • When commands are specified, bsfilter runs in maintenance mode. It updates the databases but doesn't judge mail.
  • When no commands are specified, bsfilter runs in filtering mode. It judges mail but doesn't update the databases.
  • There is one exception. When "--auto-update" is specified in filtering mode, bsfilter both judges mail and updates the databases.
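The mode rules above can be summarised as a small decision function. This is only a sketch of the rules as written, not bsfilter's actual option parser: the command names are the ones listed in the Japanese edition of this page, and `mode_for` and the hash keys are hypothetical. Combined short flags such as "-cu" are not handled.

```ruby
# Command names as listed in the documentation's command list.
COMMANDS = %w[
  --add-clean -c --add-spam -s --sub-clean -C --sub-spam -S --update -u
  --export-clean --export-spam --import-clean --import-spam --export-probability
].freeze

# Options that turn on database updates while filtering.
AUTO_UPDATE = %w[--auto-update -a --synchronous-auto-update].freeze

# Hypothetical helper: applies the three rules to an argument list.
def mode_for(args)
  if args.any? { |arg| COMMANDS.include?(arg) }
    { mode: :maintenance, updates_database: true, judges_mail: false }
  else
    auto = args.any? { |arg| AUTO_UPDATE.include?(arg) }
    { mode: :filtering, updates_database: auto, judges_mail: true }
  end
end

p mode_for(%w[--add-spam mail.txt])
p mode_for(%w[--auto-update --insert-flag])
```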

Use format 1 in filtering mode to feed a mail from stdin and judge it. The exit status is 0 if the mail is spam. This is the form normally used when bsfilter is invoked by an MDA (procmailrc etc.).

Use format 2 in filtering mode to specify multiple mails on the command line and judge them at once. Results are displayed on stdout.

Type "bsfilter --help" to see all commands and options.

7. Usage(IMAP)

bsfilter can communicate with a server over IMAP and learn from or judge the mail stored on it. It can insert headers or move mail to a specified folder.

communicate with IMAP server

Example

Sample bsfilter.conf

imap-server server.example.com
imap-auth login
imap-user hanako
imap-password open_sesame
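The sample suggests that each line of bsfilter.conf is an option name (without the leading "--") followed by its value. A minimal sketch of reading such a file is shown below, under that assumption and the further assumption that lines starting with "#" are comments; `parse_bsfilter_conf` is a hypothetical name, and this is not bsfilter's own parser.

```ruby
# Hypothetical helper: turns "name value" lines into a Hash, skipping blank
# lines and lines that start with "#" (assumed to be comments).
def parse_bsfilter_conf(text)
  text.each_line.with_object({}) do |line, options|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    name, value = line.split(/\s+/, 2)
    options[name] = value
  end
end

sample = <<~CONF
  imap-server server.example.com
  imap-auth login
  imap-user hanako
  imap-password open_sesame
CONF

p parse_bsfilter_conf(sample)
```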

Judge the mails without X-Spam-Flag in "inbox", insert X-Spam-Flag and X-Spam-Probability headers, and move spam into "inbox.spam":

% bsfilter --imap --imap-fetch-unflagged --insert-flag --insert-probability --imap-folder-spam inbox.spam inbox

8. Usage(POP proxy)

bsfilter can work as a POP proxy, judging mail and inserting headers on the path from the POP server to the MUA. "--auto-update" is a valid option in this mode, but "--add-clean" and "--add-spam" are not.

Let's assume that the POP server is running on pop.example.com using port 110.

POP without bsfilter

Here bsfilter runs as a POP proxy. POP is used between the server and bsfilter, and between bsfilter and the MUA. bsfilter judges mail and inserts headers. In this case, use the following options. Because the defaults of --pop-port and --pop-proxy-port are 110 and 10110, they can be omitted.

% bsfilter --pop --auto-update --insert-flag --insert-probability --pop-server pop.example.com --pop-port 110 --pop-proxy-port 10110
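The header insertion step that the proxy performs on each message can be pictured roughly as follows. This is a sketch under stated assumptions, not bsfilter's code: the "Yes"/"No" flag values, the six-decimal probability format, the 0.95 cutoff (the documented Robinson-Fisher default) and the name `insert_spam_headers` are all chosen for illustration.

```ruby
# Hypothetical helper: appends X-Spam-Flag and X-Spam-Probability to the
# header block of a message, leaving the body untouched.
def insert_spam_headers(message, probability, cutoff: 0.95)
  eol = message.include?("\r\n") ? "\r\n" : "\n"   # POP delivers CRLF lines
  head, separator, body = message.partition(eol * 2)
  if separator.empty?                               # headers only, no body
    head = head.chomp(eol)
    separator = eol
  end
  flag = probability >= cutoff ? "Yes" : "No"
  extra = [
    "X-Spam-Flag: #{flag}",
    format("X-Spam-Probability: %.6f", probability)
  ]
  head + eol + extra.join(eol) + separator + body
end

puts insert_spam_headers("Subject: hello\n\nbody text\n", 0.012701)
```

Because the headers are added before the MUA sees the message, the MUA's own filter rules can sort on X-Spam-Flag without knowing anything about bsfilter.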

POP with bsfilter

When pops.example.com uses POP over SSL, POP over SSL is used between the server and bsfilter, and plain POP is used between bsfilter and the MUA. (after 1.67.2.*)

% bsfilter --ssl --pop --auto-update --insert-flag --insert-probability --pop-server pops.example.com --pop-port 995 --pop-proxy-port 10110

POP with bsfilter(SSL)

nbkenichi-bsfilter-f0a5a7c/docs/index.html
bsfilter / bayesian spam filter / ベイジアン スパム フィルタ

bsfilter / bayesian spam filter / ベイジアン スパム フィルタ

Japanese / English

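0. What is bsfilter?

  • a filter that automatically distinguishes spam from legitimate mail (called "clean" on this page)
  • supports Japanese
  • written in Ruby
  • can be used in 3 ways
    • called from ~/.procmailrc or similar, learning from and judging files (that is, mails)
    • over IMAP, learning from and judging mail on the server; IMAP over SSL is also supported
    • as a POP proxy, judging mail as the MUA receives it; POP over SSL is also supported
  • its behavior is based on A Plan for Spam, Better Bayesian Filtering, and Spam Detection
  • distributed under the GPL

1. Contents

2. Download/Install

2.1. UNIX-like systems

Install the Ruby interpreter and put bsfilter/bsfilter from the archive in a suitable directory on your PATH. Depending on the OS or distribution, a package such as ports or ebuild may be available.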

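3. Roughly, how does it work?

3.1. Judging from the spam probability of words (tokens)

The spam probability of each token is computed in advance. If a mail contains many tokens with a high spam probability, the mail is judged to be spam.

judging from spam probability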
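One way to turn per-token probabilities into a verdict for a whole mail is the naive Bayes combination from "A Plan for Spam", sketched below. This is an illustration only: it corresponds to the idea behind the Paul Graham method (-m g), whereas the documented default is the Robinson-Fisher method, and the function name `combined_probability` is hypothetical.

```ruby
# Hypothetical helper: combines per-token spam probabilities as
#   P = (p1 * p2 * ...) / (p1 * p2 * ... + (1 - p1) * (1 - p2) * ...)
# An empty list yields 0.5, i.e. no evidence either way.
def combined_probability(token_probabilities)
  spam_product  = token_probabilities.reduce(1.0) { |acc, p| acc * p }
  clean_product = token_probabilities.reduce(1.0) { |acc, p| acc * (1.0 - p) }
  spam_product / (spam_product + clean_product)
end

# Token probabilities from the documentation's diagram: software 20%, money 99%.
puts combined_probability([0.20, 0.99])
```

A single strongly spammy token ("money" at 99%) outweighs a mildly clean one ("software" at 20%), which is why a mail with many high-probability tokens ends up judged as spam.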

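3.2. Preparation is required

Before actual use, the spam probabilities must be prepared. The number of occurrences of each token in already-judged mails is counted, and the probability is computed as spam / (clean + spam). The preparation is done with the --add-clean, --add-spam, and --update commands.

preparation is required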
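The spam / (clean + spam) calculation can be sketched in a few lines. The counts below are the ones drawn in the documentation's diagram (software: 20 clean, 5 spam; money: 1 clean, 99 spam); the helper name `spam_probabilities` is hypothetical, and bsfilter's real --update step may apply further adjustments that are not described here.

```ruby
# Token counts as drawn in the documentation's diagram (illustrative data).
clean_counts = { "software" => 20, "money" => 1 }
spam_counts  = { "software" => 5,  "money" => 99 }

# Hypothetical helper: spam / (clean + spam) for every token that appears
# in either count table.
def spam_probabilities(clean_counts, spam_counts)
  tokens = clean_counts.keys | spam_counts.keys
  tokens.to_h do |token|
    clean = clean_counts.fetch(token, 0)
    spam  = spam_counts.fetch(token, 0)
    [token, spam.to_f / (clean + spam)]
  end
end

probabilities = spam_probabilities(clean_counts, spam_counts)
probabilities.each { |token, p| puts format("%-8s %.2f", token, p) }
```

These counts give 0.20 for "software" and 0.99 for "money", matching the 20% and 99% shown in the probability database of the diagram.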

4. Let's try it

Preparation

The databases must be prepared before actually judging mail.

1. Count the words in clean mails.

% bsfilter --add-clean ~/Mail/inbox/*

2. Count the words in spam.

% bsfilter --add-spam ~/Mail/spam/*

3. Compute the clean/spam probability of each word.

% bsfilter --update

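That completes the preparation.

Judging

Example of specifying the mail to judge as a command line argument. The spam probability is displayed in the range from 0 to 1.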

% bsfilter ~/Mail/inbox/1
combined probability /home/nabeken/Mail/inbox/1 1 0.012701

Example of feeding the mail to judge through standard input. If the mail is spam, the exit status is 0.

~% bsfilter < ~/Mail/inbox/1 ; echo $status
1
~% bsfilter < ~/Mail/spam/1 ; echo $status
0

Sample recipe 1 for calling bsfilter from ~/.procmailrc: move spam to the spam folder using the exit status.

:0 HB:
* ? bsfilter -a
spam/.

Sample recipe 2 for calling bsfilter from ~/.procmailrc: add X-Spam-Flag: and X-Spam-Probability: headers, then move mail to the black or gray folder according to the probability given in X-Spam-Probability.

:0 fw
| /home/nabeken/bin/bsfilter --pipe --insert-flag --insert-probability

:0
* ^X-Spam-Probability: *(1|0\.[89])
black/.

:0
* ^X-Spam-Probability: *0\.[67]
gray/.

5. Help

Please use issues.

6. Usage

Command line formats

There are the following 2 formats.

  1. bsfilter [options] [commands] < MAIL
  2. bsfilter [options] [commands] MAIL ...

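There are two modes: maintenance mode and filtering mode.

  • When commands are specified, bsfilter runs in maintenance mode. It updates the databases but does not judge spam.
  • When no commands are specified, bsfilter runs in filtering mode. It judges spam but does not update the databases.
  • As an exception, when --(synchronous-)auto-update is specified in filtering mode, bsfilter both judges spam and updates the databases.

To feed a mail through standard input and have it judged, use format 1 in filtering mode. If the mail is spam, the exit status is 0. This is the form normally used when calling bsfilter from ~/.procmailrc.

To give mail file names as command line arguments and have them judged, use format 2 in filtering mode. Multiple mails can be judged at once. Results are displayed on standard output.

List of commands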

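--add-clean
-c
Adds the tokens in clean mail to the database.
--add-spam
-s
Adds the tokens in spam to the database.
--sub-clean
-C
Subtracts the tokens in clean mail from the database.
--sub-spam
-S
Subtracts the tokens in spam from the database.
--update
-u
Updates the database that holds the clean/spam probability of each token. When used together with -c, -s, etc., the probability database is updated only for the tokens in the given mail.
--export-clean
Exports the clean tokens from the database to text in bsfilter's own format.
--export-spam
Exports the spam tokens from the database to text in bsfilter's own format.
--import-clean
Imports clean tokens into the database from text in bsfilter's own format.
--import-spam
Imports spam tokens into the database from text in bsfilter's own format.
--export-probability
Exports the probability database. This is a debugging feature; the output cannot be imported.

Several commands can be specified at once.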

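Options

--homedir directory
Specifies the home directory in which the databases and lock files are created. If --homedir is not given, the directory set in the BSFILTERHOME environment variable is used. If BSFILTERHOME is not set, ~/.bsfilter is used. If the HOME environment variable is not set, the directory containing bsfilter is used.
--config-file filename
Specifies the configuration file. By default, bsfilter.conf in the bsfilter home directory described above is used.
## example of bsfilter.conf
jtokenizer MeCab
spam-cutoff 0.6
--max-line number
Only the specified number of lines from the top of the mail are classified and learned.
--db sdbm|gdbm|bdb1|bdb|qdbm
Specifies the database format. The default is sdbm. bdb and qdbm are supported since 1.0.8.
--jtokenizer bigram|block|mecab|chasen|kakasi
-j bigram|block|mecab|chasen|kakasi
Specifies the algorithm used to extract tokens from Japanese mail. The supported choices are bigram (two consecutive kanji form one token), block (a whole run of consecutive kanji forms one token), and calling MeCab, ChaSen, or KAKASI. The default is bigram. To use MeCab, ChaSen, or KAKASI, the corresponding ruby binding must be installed beforehand.
--method g|r|rf
-m g|r|rf
Specifies whether to follow the method proposed by Paul Graham (g), the method proposed by Gary Robinson (r), or the Gary Robinson-Fisher method (rf). The default is the Gary Robinson-Fisher method.
--spam-cutoff number
Specifies the spam probability at or above which a mail is judged to be spam. The defaults are 0.9 for the Paul Graham method, 0.582 for the Gary Robinson method, and 0.95 for the Robinson-Fisher method.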
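--auto-update
-a
Judges whether the mail is clean or spam, adds the tokens in the mail to the database according to the result, and also updates the probability database.
--disable-degeneration
-D
Disables degeneration when looking up the probability database.
--disable-utf-8
Disables utf-8 support.
--refer-header header[,header...]
Specifies the headers to refer to. To specify several, list them separated by commas without spaces. The default is equivalent to specifying "Ufrom,From,To,Cc,Subject,Reply-to,Return-path,Received,Content-Transfer-Encoding,Content-Type,charset,Content-Disposition".
--ignore-header
-H
Ignores the mail headers. Equivalent to specifying --refer-header "".
--ignore-body
-B
Ignores the mail body except for URLs and mail addresses.
--ignore-plain-text-part
Ignores the plain text part when an html part exists.
--ignore-after-last-atag
Ignores everything after the last A end tag.
--mark-in-token "characters"
Specifies the symbols that may appear inside a token (i.e. that do not act as token separators). By default, "*'!" is set.
--show-new-token
Displays tokens newly added to the database.
--show-process
Displays the progress of processing. The columns mean, from the left: protocol, language, judgment, learning command, date and time, and message-id.
--show-db-status
Displays the database status and exits. The columns mean, from the left: "db" (fixed), language, number of clean tokens, number of clean mails, number of spam tokens, number of spam mails, and number of tokens in the probability database.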
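--mbox
Supports the mbox format. A single file is treated as multiple mails separated by "unix from" lines.
--max-mail number
When the number of mails in the token database exceeds this value, the token database is shrunk. The default is 10000.
--min-mail number
The token database is shrunk by multiplying the occurrence count of each token in the token database by (the value given with min-mail / the value given with max-mail). The default is 8000.
--pipe
Writes the mail to stdout.
--insert-flag
Adds "X-Spam-Flag: Yes" or "X-Spam-Flag: No" to the headers of the mail written to stdout. In filtering mode the value follows the judgment; in maintenance mode it follows the command.
--insert-probability
Adds "X-Spam-Probability: number" to the headers of the mail written to stdout. Effective only in filtering mode.
--insert-revision
Adds "X-Spam-Revision: bsfilter..." to the headers of the mail written to stdout. Effective only in filtering mode.
--header-prefix string
Adds "X-<specified string>-..." instead of "X-Spam-..." to the mail headers.
--mark-spam-subject
For spam, inserts "[SPAM] " into the Subject header of the mail.
--spam-subject-prefix string
Inserts the specified string instead of "[SPAM] " into the Subject header of the mail.
--list-clean
Displays the file names of mails judged clean. Effective only when the mails to classify are given as command-line arguments.
--list-spam
Displays the file names of mails judged spam. Effective only when the mails to classify are given as command-line arguments.
--help
-h
Displays help.
--revision
Displays the revision.
--verbose
-v
Displays more messages.
--debug
-d
Displays debugging messages.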
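The lookup order described for --homedir (option, then BSFILTERHOME, then ~/.bsfilter, then the directory containing bsfilter) can be written out as a small function. This is a sketch of the documented order only; the method name `bsfilter_homedir` and its parameters are hypothetical and the real implementation may differ in detail.

```ruby
# Minimal sketch of the documented --homedir resolution order.
# option_homedir: value of --homedir, or nil when not given
# env:            a Hash-like environment (ENV works too)
# script_dir:     directory containing the bsfilter script
def bsfilter_homedir(option_homedir, env, script_dir)
  return option_homedir if option_homedir
  return env['BSFILTERHOME'] if env['BSFILTERHOME']
  return File.join(env['HOME'], '.bsfilter') if env['HOME']

  script_dir
end

puts bsfilter_homedir(nil, { 'HOME' => '/home/hanako' }, '/usr/local/bin')
```

Passing the environment in as an argument keeps the function easy to try with different combinations of set and unset variables.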

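7. Usage (IMAP)

bsfilter can communicate with a server over IMAP to learn from and classify mail stored on the server. According to the judgment, it can insert headers and move mail to a specified folder.

[Figure: communicating with an IMAP server]

IMAP options

--imap
Specify this when using IMAP. Required.
--imap-server hostname
Specifies the IMAP server. Required.
--imap-port number
Specifies the port number used by the IMAP server. The default is 143.
--imap-auth cram-md5|login|loginc|auto
Specifies the authentication method. Possible values are cram-md5 (the AUTHENTICATE CRAM-MD5 command), login (the AUTHENTICATE LOGIN command), loginc (the LOGIN command), and auto (picks a suitable method, trying cram-md5, login, and loginc in that order). The default is auto.
--imap-user user_name
Specifies the user name on the IMAP server. Required.
--imap-password password
Specifies the password for the IMAP server. Required.
--imap-folder-clean folder_name
Specifies the destination folder for mail learned as clean and mail judged clean.
--imap-folder-spam folder_name
Specifies the destination folder for mail learned as spam and mail judged spam.
--imap-fetch-unseen
Use this to learn from and classify only mail without the SEEN flag.
--imap-fetch-unflagged
Use this to learn from and classify only mail without an X-Spam-Flag header.
--imap-reset-seen-flag
Resets the SEEN flag of mail into which bsfilter inserted headers or which it moved to another folder.
--ssl
Uses IMAP over SSL with the server specified by --imap-server.
--ssl-cert filename|dirname
Specifies the certificate used for IMAP over SSL.

Options that are invalid when --imap is used

--pipe is invalid.

Examples

Example of bsfilter.conf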

imap-server imap.example.com
imap-auth login
imap-user hanako
imap-password open_sesame

An example that classifies the mail in inbox that has no X-Spam-Flag header, puts the judgment in the headers, and moves spam to inbox.spam

% bsfilter --imap --imap-fetch-unflagged --insert-flag --insert-probability --imap-folder-spam inbox.spam inbox

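8. Usage (POP proxy)

bsfilter can run as a POP proxy, classifying mail and inserting headers along the path on which the MUA retrieves mail from the server via POP. Automatic learning with --auto-update is possible, but learning with --add-clean or --add-spam is not.

Suppose a POP server is running on port 110 of pop.example.com.

[Figure: POP without bsfilter]

bsfilter runs as a POP proxy. POP is used both between the server and bsfilter and between bsfilter and the MUA. Inside bsfilter, mail is classified and headers are inserted. In this case, set the options as follows (the default for --pop-port is 110 and the default for --pop-proxy-port is 10110, so both can be omitted).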

% bsfilter --pop --auto-update --insert-flag --insert-probability --pop-server pop.example.com --pop-port 110 --pop-proxy-port 10110

[Figure: running as a POP proxy]

If pops.example.com uses POP over SSL, POP over SSL is used between the server and bsfilter, and plain POP is used between bsfilter and the MUA.

% bsfilter --ssl --pop --auto-update --insert-flag --insert-probability --pop-server pops.example.com --pop-port 995 --pop-proxy-port 10110

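[Figure: running as a POP proxy (SSL)]

POP options

--pop
Specify this to run bsfilter as a POP proxy. Required.
--tasktray
On windows with VisualuRuby, keeps bsfilter resident in the task tray.
--pop-server hostname
Specifies the POP server.
--pop-port number
Specifies the port number used by the POP server. The default is 110.
--pop-proxy-if address
Specifies the address of the interface on which bsfilter accepts requests from the mailer. The default is 0.0.0.0, which accepts requests on all interfaces.
--pop-proxy-port number
Specifies the port number on which bsfilter accepts requests from the mailer. The default is 10110.
--pop-user name
Use this to allow only a specific user. Specifies the user name on the POP server.
--pop-proxy-set set[,set...]
Describes the POP proxy rules. Several rules can be given. Use this instead of the --pop-server, --pop-port, --pop-proxy-port, and --pop-user options. The rule format is "pop-server:[pop-port]:[proxy-interface]:proxy-port[:pop-user]"
pop-server
Specifies the POP server.
pop-port
Specifies the port number used by the POP server. Defaults to 110 when omitted.
proxy-interface
Specifies the interface on which bsfilter accepts requests from the mailer. When omitted, requests are accepted on all interfaces.
proxy-port
Specifies the port number on which bsfilter accepts requests from the mailer.
pop-user
Use this to allow only a specific user. Specifies the user name on the POP server.
--pop-max-size number
Mail of the specified number of bytes or more bypasses processing and is neither classified nor learned. Specifying 0 makes all mail subject to classification and learning. The default is 50000 (50Kbytes).
--pid-file filename
Specifies the file name in which the process ID is recorded. The default is bsfilter.pid in the bsfilter home directory.
--ssl
Uses POP over SSL with the server specified by --pop-server.
--ssl-cert filename|dirname
Specifies the certificate used for POP over SSL.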
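The rule format of --pop-proxy-set, "pop-server:[pop-port]:[proxy-interface]:proxy-port[:pop-user]", is easiest to understand by splitting a few rules by hand. The following sketch does that in Ruby. `PopProxyRule` and `parse_pop_proxy_set` are names invented for this illustration, and the defaults filled in for omitted fields are the ones stated in the option list (110 and 0.0.0.0); bsfilter's own parser may behave differently in edge cases.

```ruby
# Hypothetical container for one parsed rule.
PopProxyRule = Struct.new(:pop_server, :pop_port, :proxy_interface,
                          :proxy_port, :pop_user)

# Splits "set[,set...]" into rules and fills in documented defaults.
def parse_pop_proxy_set(text)
  text.split(',').map do |rule|
    server, port, interface, proxy_port, user = rule.split(':', 5)
    if server.to_s.empty? || proxy_port.to_s.empty?
      raise ArgumentError, "pop-server and proxy-port are required: #{rule}"
    end

    PopProxyRule.new(
      server,
      port.to_s.empty? ? '110' : port,                # default POP port
      interface.to_s.empty? ? '0.0.0.0' : interface,  # all interfaces
      proxy_port,
      user.to_s.empty? ? nil : user                   # nil: any user
    )
  end
end

rules = parse_pop_proxy_set('pop.example.com:::10110,pops.example.com:995:127.0.0.1:10995:hanako')
rules.each { |rule| p rule.to_a }
```

Because the fields are separated by colons, this sketch cannot handle an IPv6 address as the proxy interface.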

Options that are invalid when --pop is used

--pipe, as well as learning-related commands and options such as --add-clean, are invalid.

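9. FAQ

Q. bsfilter terminates abnormally when a certificate is specified for IMAP over SSL

This is a problem in net/imap.rb of the ruby standard library. It should not occur with ruby-1.8.3 or later.

Q. It dies with the error in `get_all_responses': Mailbox does not exist, or must be subscribed to. (Net::IMAP::NoResponseError)

The mailbox name is probably wrong. You can check it with imap.rb from ruby-1.8 as follows.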

% ruby /usr/local/lib/ruby/1.8/net/imap.rb --user=taro --auth=login imap.example.com
password: ******
taro@imap.example.com> list
 INBOX.junk
 INBOX

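Q. I want to use it on windows.

If Ruby is already installed (the mswin32 version, the cygwin version, etc.), just run the bsfilter source as usual.

Q. What are the exit statuses?

Normally 0. The judgment can be obtained from the exit status only when the mail to classify is given through standard input without the --pipe option. In that case, the exit status is 0 for spam and 1 for clean.

Q. I get the error No such file to load -- sdbm.

Make sdbm available. On debian, for example, install libsdbm-ruby.

Q. How do I use it with mew?

See bsfilter with mew.

Q. How do I use it from Wanderlust?

Yamada Akira (やまだあきら) has published wl-bsfilter.el. See "\ay diary" and "Re: Mark & Action (Re: 2種類の削除コマンド)".

Q. How is Japanese handled?

Whether a mail is Japanese is decided ad hoc; if it appears to be Japanese, it is converted to EUC with nkf.so. nkf also seems to rescue mail whose Subject or body claims ISO-2022-JP while the actual encoding is shift jis. If iconv is available, unicode is supported as well.

The databases are kept separately for Japanese and for everything else.

By default, two consecutive kanji (bigram) and katakana are treated as tokens. To perform morphological analysis with MeCab, install MeCab and its ruby binding and specify --jtokenizer MeCab. The same applies to ChaSen and KAKASI.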
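The difference between the bigram and block tokenizers can be illustrated with a short Ruby sketch. It is not bsfilter's tokenizer: it assumes UTF-8 input (bsfilter converts to EUC internally), handles only kanji and not katakana, and simply drops a kanji that stands alone in bigram mode. The method name `kanji_tokens` is hypothetical.

```ruby
# Minimal sketch of kanji tokenization on a UTF-8 string.
#   :bigram - every pair of adjacent kanji in a run is one token
#   :block  - each whole run of consecutive kanji is one token
def kanji_tokens(text, mode = :bigram)
  text.scan(/\p{Han}+/).flat_map do |run|
    case mode
    when :bigram then run.chars.each_cons(2).map(&:join)
    when :block  then [run]
    else raise ArgumentError, "unknown mode: #{mode}"
    end
  end
end

p kanji_tokens('未承諾広告のお知らせ', :bigram)
p kanji_tokens('未承諾広告のお知らせ', :block)
```

Bigrams produce more, shorter tokens that can still match when a long compound appears in a slightly different form, whereas block tokens are fewer but more specific.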

Q. I used the -a option and a clean mail was misjudged as spam. How do I deal with the database that was updated based on the misjudgment?

If the misjudged mail is ~/Mail/spam/123, run

% bsfilter --sub-spam --add-clean --update ~/Mail/spam/123 

This removes ~/Mail/spam/123 from the spam token database, adds it to the clean token database, and updates the probability database.

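Q. I have not kept the spam I received in the past, so I cannot build a spam token database. What should I do?

It is convenient to use the collection of spam gathered from the Linux-users mailing list, which is introduced in "TLEC presents spamassassin を fetchmail から利用する方法".

The following archives of the FreeBSD ports-jp ML contain only spam.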

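Q. How do I switch between the Paul Graham, Gary Robinson, and Gary Robinson-Fisher methods?

To switch between the Paul Graham method and the Gary Robinson or Gary Robinson-Fisher method, start over from updating the spam probability database with -u. The same method must be specified with the -m option both when updating the spam probability database (maintenance mode) and when having mail judged (filtering mode).

When switching between the Gary Robinson method and the Gary Robinson-Fisher method, -u is not needed.

Counting tokens with -c and -s does not depend on the method, so it never needs to be redone.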

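Q. What is the difference between the Paul Graham method and the Gary Robinson method?

Both the way the spam probability of each token is computed and the way it is used to compute the spam probability of a mail differ.

Let pg(w) be the per-token spam probability in the Paul Graham method. The characteristics of computing pg(w) are as follows.

  • To bias toward clean, the actual number of token occurrences in clean mail is doubled before the calculation.
  • pg(w) of a token never seen before is set to 0.4.

Let f(w) be the per-token spam probability in the Gary Robinson method. It is computed as follows.

  • The per-token spam probability p(w) is computed without biasing the token occurrence counts.
  • With robx the mean of p(w) over all tokens, n the number of occurrences of the token, and robs a constant (for example 0.001), let
    f(w) = ((robs * robx) + (n * p(w))) / (robs + n)
    
    The f(w) of a token never seen before is also covered by this formula.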
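A short numeric sketch may help to see how f(w) behaves. The method name `robinson_f` is made up for this example, and the values of robx used below are arbitrary illustration values, not bsfilter defaults.

```ruby
# f(w) = ((robs * robx) + (n * p(w))) / (robs + n)
# p_w:  per-token spam probability p(w)
# n:    number of occurrences of the token
# robx: mean of p(w) over all tokens
# robs: small constant (0.001 is the example value from the text)
def robinson_f(p_w, n, robx, robs = 0.001)
  ((robs * robx) + (n * p_w)) / (robs + n)
end

puts robinson_f(0.0, 0, 0.4)   # unseen token: falls back to robx
puts robinson_f(0.9, 1, 0.4)   # one occurrence: already close to p(w)
puts robinson_f(0.9, 100, 0.4) # many occurrences: practically p(w)
```

With n = 0 the second term vanishes and f(w) equals robx, which is how the formula covers tokens that have never been seen. Because robs is small, even a single occurrence pulls f(w) almost all the way to p(w).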

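In the Paul Graham method, the combining probability is computed from the 15 tokens with the most characteristic pg(w) (those farthest from 0.5), and it is used as the spam probability of the mail.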
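The selection of the 15 most characteristic tokens can be sketched as follows. Note the assumptions: the combining formula used here, prod(p) / (prod(p) + prod(1 - p)), is the one given in Paul Graham's "A Plan for Spam" and is not spelled out in this document; the method name `graham_combined` is hypothetical; and every probability is assumed to lie strictly between 0 and 1.

```ruby
# Minimal sketch: pick the `limit` probabilities farthest from 0.5
# and combine them with Graham's formula (an assumption, see above).
def graham_combined(token_probs, limit = 15)
  interesting = token_probs.sort_by { |p| -(p - 0.5).abs }.first(limit)
  spam = interesting.inject(1.0) { |acc, p| acc * p }
  clean = interesting.inject(1.0) { |acc, p| acc * (1.0 - p) }
  spam / (spam + clean)
end

puts graham_combined([0.99, 0.99, 0.5, 0.4])
puts graham_combined([0.01, 0.2, 0.4])
```

Tokens near 0.5 carry almost no weight in the product, which is why restricting the calculation to the most extreme tokens changes the result very little while keeping it cheap.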

In the Gary Robinson method, S computed as below is the spam probability of the mail (what bsfilter displays is S2).

P = 1 - ((1 - f(w1)) * (1 - f(w2)) * ... * (1 - f(wn))) ^ (1 / n)
Q = 1 - (f(w1) * f(w2) * ... * f(wn)) ^ (1 / n)
S = (P - Q) / (P + Q)
S2 = (1 + S) / 2
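The four formulas above translate directly into Ruby. This sketch takes the list of f(w) values of the tokens in a mail and returns S2; the method name `robinson_s2` is hypothetical, and the direct product used here can underflow for very long mails, which a real implementation would have to guard against (for example by summing logarithms).

```ruby
# P  = 1 - ((1 - f(w1)) * ... * (1 - f(wn))) ^ (1 / n)
# Q  = 1 - (f(w1) * ... * f(wn)) ^ (1 / n)
# S  = (P - Q) / (P + Q)
# S2 = (1 + S) / 2
def robinson_s2(fs)
  n = fs.length
  raise ArgumentError, 'no tokens' if n.zero?

  prod_clean = fs.inject(1.0) { |acc, f| acc * (1.0 - f) }
  prod_spam = fs.inject(1.0) { |acc, f| acc * f }
  p_val = 1.0 - prod_clean**(1.0 / n)
  q_val = 1.0 - prod_spam**(1.0 / n)
  s = (p_val - q_val) / (p_val + q_val)
  (1.0 + s) / 2.0
end

puts robinson_s2([0.5, 0.5, 0.5]) # neutral tokens
puts robinson_s2([0.9, 0.9])      # spam-like tokens
puts robinson_s2([0.1, 0.2, 0.1]) # clean-like tokens
```

S ranges from -1 to 1, so S2 rescales it to the 0 to 1 range in which the spam-cutoff value is expressed.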

Q. What is the difference between the Gary Robinson method and the Gary Robinson-Fisher method?

They are the same up to computing f(w). After that, the Robinson-Fisher method computes as follows.

P = ((1 - f(w1)) * (1 - f(w2)) * ... * (1 - f(wn))) ^ (1 / n)
Q = (f(w1) * f(w2) * ... * f(wn)) ^ (1 / n)
P' = 1 - chi-square(-2 * log(P), 2 * n)
Q' = 1 - chi-square(-2 * log(Q), 2 * n)
S = (1 + P' - Q') / 2

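10. Bugs

An mbox-format file can also be given as a command-line argument, but there is no facility for looking at the Content-length header. Therefore, on systems such as Solaris where "From" in the body is not escaped, a "From" in the body may be mistaken for the UnixFrom. This would cause the mail numbers displayed by bsfilter to disagree with the numbers in the MUA.

The code is messy.

The name bsfilter is unimaginative.

11. Change log

The bsfilter category of the author's diary serves as a substitute.

12. Sources / Links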

nbkenichi-bsfilter-f0a5a7c/docs/judge.png (PNG image; binary data omitted)

bsfilter with mew

index

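Using bsfilter from mew version 5 or later

Configure mew's spam-related settings to use bsfilter.

Installation

  • Add mua/mew{5, 6, 6.4}/mew.el, whichever matches your mew version, to ~/.emacs.el or similar.

Usage

When mew retrieves messages, if an X-Spam-Flag header has already been added by the MTA, a POP proxy, etc., spam is automatically given the "D" mark.

The following are available in summary mode.

lh
learn-ham. Learns the current message as clean.
ls
learn-spam. Learns the current message as spam.
lm
Learns from and classifies the mail in the region. Spam is given the "*" mark. (mew 6.4 or later)
bm
Learns from and classifies the mail in the region. Spam is given the "*" mark. (mew 6.3 or earlier)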

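Using bsfilter from mew version 4

Configure mew's spam-related settings to use bsfilter.

Installation

  • Add mua/mew4/emacs.el to ~/.emacs.el.
  • Add mua/mew4/mew.el to ~/.mew.el. For Mew 4.2.53, use the commented-out definition of mew-summary-bsfilter-mark-region.

Usage

When mew retrieves messages, if an X-Spam-Flag header has already been added by the MTA, a POP proxy, etc., spam is automatically given the "D" mark.

The following are available in summary mode.

lh
learn-ham. Learns the current message as clean.
ls
learn-spam. Learns the current message as spam.
bm
Learns from and classifies the mail in the region. Spam is given the "*" mark.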

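What is the bsfilter mew version 3 front-end?

Emacs lisp and shell scripts that make it possible to call bsfilter from the summary mode of mew version 3.

Installation

  • Add mua/mew3/mew.el to ~/.mew.el.
  • Place mua/mew3/bs_mark, bs_clean, and bs_spam in a directory on your PATH.

Usage

The following are available in summary mode.

bm
Judges whether the mail on the cursor line is clean or spam. According to the judgment, adds a header such as "X-Spam-Flag: Yes" to the mail and puts the "*" mark on it.
bM
Performs the same processing as bm on all mail marked with "@". It is recommended to set up a refile rule such as "move mail with "X-Spam-Flag: Yes" to the SPAM folder" and to do C-u M-o after bM.
bc
Performs --add-clean --sub-spam --update on the mail on the cursor line. It can be used to repair a database that was updated based on a misjudgment by -a (clean misjudged as spam).
bC
Performs the same processing as bc on all mail marked with "@".
bs
Performs --sub-clean --add-spam --update on the mail on the cursor line. It can be used to repair a database that was updated based on a misjudgment by -a (spam misjudged as clean).
bS
Performs the same processing as bs on all mail marked with "@".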

index

nbkenichi-bsfilter-f0a5a7c/docs/pop-with-bsfilter.png (PNG image; binary data omitted)
nbkenichi-bsfilter-f0a5a7c/docs/pop-without-bsfilter.png (PNG image; binary data omitted)
nbkenichi-bsfilter-f0a5a7c/docs/pops-with-bsfilter.png (PNG image; binary data omitted)
nbkenichi-bsfilter-f0a5a7c/docs/pops-without-bsfilter.png (PNG image; binary data omitted)
nbkenichi-bsfilter-f0a5a7c/docs/prepare.png (PNG image; binary data omitted)
nbkenichi-bsfilter-f0a5a7c/mda/
nbkenichi-bsfilter-f0a5a7c/mda/maildrop/
nbkenichi-bsfilter-f0a5a7c/mda/maildrop/mailfilter.header
# $Id: mailfilter.header,v 1.1 2004/02/29 04:53:35 nabeken Exp $
# you should make "Maildir/spam" by "maildirmake"
xfilter "bsfilter --pipe --insert-flag --insert-probability"
if (/^X-Spam-Flag: Yes/:h)
{
  to Maildir/spam
}
nbkenichi-bsfilter-f0a5a7c/mda/procmail/
nbkenichi-bsfilter-f0a5a7c/mda/procmail/procmailrc.black_gray
# $Id: procmailrc.black_gray,v 1.1 2004/02/28 15:25:01 nabeken Exp $
:0 fw
| bsfilter -a --pipe --insert-flag --insert-probability

:0
* ^X-Spam-Probability: *(1|0\.[89])
Mail/black/.

:0
* ^X-Spam-Probability: *0\.[67]
Mail/gray/.
nbkenichi-bsfilter-f0a5a7c/mda/procmail/procmailrc.header000066400000000000000000000002441465373635000236340ustar00rootroot00000000000000# $Id: procmailrc.header,v 1.1 2004/02/28 15:25:01 nabeken Exp $ :0 fw | bsfilter -a --pipe --insert-flag --insert-probability :0 * ^X-Spam-Flag: Yes Mail/spam/. nbkenichi-bsfilter-f0a5a7c/mda/procmail/procmailrc.status000066400000000000000000000001441465373635000237260ustar00rootroot00000000000000# $Id: procmailrc.status,v 1.1 2004/02/28 15:25:01 nabeken Exp $ :0 HB * ? bsfilter -a Mail/spam/. nbkenichi-bsfilter-f0a5a7c/mua/000077500000000000000000000000001465373635000165425ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/mua/mew3/000077500000000000000000000000001465373635000174155ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/mua/mew3/bs_clean000077500000000000000000000006711465373635000211150ustar00rootroot00000000000000#! /bin/sh ## $Id: bs_clean,v 1.1 2004/02/28 15:24:09 nabeken Exp $ bsfilter=bsfilter log=$HOME/.bsfilter/log umask 077 mark_one () { $bsfilter -u --add-clean --sub-spam --pipe --insert-flag < $target > $target.$$ 2>> $log if [ $? -eq 0 ]; then /usr/bin/touch -r $target $target.$$ mv $target.$$ $target fi } while [ "$*" != "" ]; do target=$1 if [ -r $target -a -f $target ]; then mark_one fi shift done nbkenichi-bsfilter-f0a5a7c/mua/mew3/bs_mark000077500000000000000000000006661465373635000207710ustar00rootroot00000000000000#! /bin/sh ## $Id: bs_mark,v 1.1 2004/02/28 15:24:09 nabeken Exp $ bsfilter=bsfilter log=$HOME/.bsfilter/log umask 077 mark_one () { $bsfilter -a --pipe --insert-flag --insert-probability < $target > $target.$$ 2>> $log if [ $? -eq 0 ]; then /usr/bin/touch -r $target $target.$$ mv $target.$$ $target fi } while [ "$*" != "" ]; do target=$1 if [ -r $target -a -f $target ]; then mark_one fi shift done nbkenichi-bsfilter-f0a5a7c/mua/mew3/bs_spam000077500000000000000000000006701465373635000207720ustar00rootroot00000000000000#! 
/bin/sh ## $Id: bs_spam,v 1.1 2004/02/28 15:24:09 nabeken Exp $ bsfilter=bsfilter log=$HOME/.bsfilter/log umask 077 mark_one () { $bsfilter -u --add-spam --sub-clean --pipe --insert-flag < $target > $target.$$ 2>> $log if [ $? -eq 0 ]; then /usr/bin/touch -r $target $target.$$ mv $target.$$ $target fi } while [ "$*" != "" ]; do target=$1 if [ -r $target -a -f $target ]; then mark_one fi shift done nbkenichi-bsfilter-f0a5a7c/mua/mew3/mew.el000066400000000000000000000046321465373635000205340ustar00rootroot00000000000000;; $Id: mew.el,v 1.1 2004/02/28 15:24:09 nabeken Exp $ (define-key mew-summary-mode-map "bm" 'mew-bsfilter-mark) (define-key mew-summary-mode-map "bM" 'mew-bsfilter-mark-multi) (define-key mew-summary-mode-map "bs" 'mew-bsfilter-spam) (define-key mew-summary-mode-map "bS" 'mew-bsfilter-spam-multi) (define-key mew-summary-mode-map "bc" 'mew-bsfilter-clean) (define-key mew-summary-mode-map "bC" 'mew-bsfilter-clean-multi) (setq mew-refile-ctrl-multi nil) (setq mew-field-spec (reverse (append (list (car (reverse mew-field-spec))) '(("^X-Spam-Probability:$" t) ("^X-Spam-Flag:$" t)) (cdr (reverse mew-field-spec))))) (defun mew-bsfilter-cmd-msg (command) "Executing an external command specifying this message as an argument." (interactive) (mew-summary-goto-message) (mew-summary-msg (let* ((fld (mew-summary-folder-name)) (msg (mew-summary-message-number)) (file (mew-expand-folder fld msg))) (while (not (mew-which-exec command)) (setq command (read-string "Command: "))) (message (format "Executing %s for %s..." command msg)) (call-process command nil nil nil file) (message (format "Executing %s for %s...done" command msg))))) (defun mew-bsfilter-cmd-msgs (command) "Executing an external command specifying messages marked with '@' as arguments." (interactive) (mew-summary-multi-msgs (let () (while (not (mew-which-exec command)) (setq command (read-string "Command: "))) (message (format "Executing %s ..." 
command)) (apply 'call-process command nil nil nil FILES) (message (format "Executing %s ...done" command))))) (defun mew-bsfilter-mark () "mew-bsfilter-mark" (interactive) (mew-bsfilter-cmd-msg "bs_mark") (mew-summary-review)) (defun mew-bsfilter-mark-multi () "mew-bsfilter-mark-multi" (interactive) (mew-bsfilter-cmd-msgs "bs_mark") (mew-summary-mark-review)) (defun mew-bsfilter-spam () "mew-bsfilter-spam" (interactive) (mew-bsfilter-cmd-msg "bs_spam") (mew-summary-review)) (defun mew-bsfilter-spam-multi () "mew-bsfilter-spam-multi" (interactive) (mew-bsfilter-cmd-msgs "bs_spam") (mew-summary-mark-review)) (defun mew-bsfilter-clean () "mew-bsfilter-clean" (interactive) (mew-bsfilter-cmd-msg "bs_clean") (mew-summary-review)) (defun mew-bsfilter-clean-multi () "mew-bsfilter-clean-multi" (interactive) (mew-bsfilter-cmd-msgs "bs_clean") (mew-summary-mark-review)) nbkenichi-bsfilter-f0a5a7c/mua/mew4/000077500000000000000000000000001465373635000174165ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/mua/mew4/emacs.el000066400000000000000000000000401465373635000210220ustar00rootroot00000000000000(setq mew-spam: "X-Spam-Flag:") nbkenichi-bsfilter-f0a5a7c/mua/mew4/mew.el000066400000000000000000000032111465373635000205250ustar00rootroot00000000000000;; $Id: mew.el,v 1.3 2006/01/14 07:38:50 nabeken Exp $ ; put "D" (defun mew-spam-bsfilter (val) (let ((case-fold-search t)) (if (string-match "yes" val) ?D))) ; put "o +sapm" at inc ;(defun mew-spam-bsfilter (val) ; (let ((case-fold-search t)) ; (if (string-match "yes" val) "+spam"))) (setq mew-inbox-action-alist '(("X-Spam-Flag:" mew-spam-bsfilter))) ; for "ls" (learn-spam) (setq mew-spam-prog "bsfilter") (setq mew-spam-prog-args '("-C" "-s" "-u")) ; for "lh" (learn-ham) (setq mew-ham-prog "bsfilter") (setq mew-ham-prog-args '("-c" "-S" "-u")) ; for "bm" (mark-spam) (define-key mew-summary-mode-map "bm" 'mew-summary-bsfilter-mark-region) (defun mew-summary-bsfilter-mark-region (&optional arg) "study/judge the region and 
put the '*' mark onto spams. need to re-learn if judgment of bsfilter is wrong" (interactive "P") (mew-pickable (let ((func 'mew-summary-pick-with-cmd) (mew-inherit-grep-cmd "bsfilter -a --list-spam")) (mew-summary-pick-body func t nil 'nopattern)))) ; code for Mew 4.2.53 by emon ; (defun mew-summary-bsfilter-mark-region (&optional arg) ; "study/judge the region and put the '*' mark onto spams. ; need to re-learn if judgment of bsfilter is wrong. ; press return-key twice at the prompt." ; (interactive "P") ; (mew-pickable ; (let ((mew-prog-grep "bsfilter") ; (mew-prog-grep-opts '("-a" "--list-spam"))) ; (mew-summary-pick t)))) ; show X-Spam-Flag and X-Spam-Probability in message buffer (setq mew-field-spec (reverse (append (list (car (reverse mew-field-spec))) '(("^X-Spam-Probability:$" t) ("^X-Spam-Flag:$" t)) (cdr (reverse mew-field-spec))))) nbkenichi-bsfilter-f0a5a7c/mua/mew5/000077500000000000000000000000001465373635000174175ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/mua/mew5/mew.el000066400000000000000000000027451465373635000205410ustar00rootroot00000000000000;; $Id: mew.el,v 1.1 2006/05/22 13:52:58 nabeken Exp $ ; moved from .emacs.el (setq mew-spam: "X-Spam-Flag:") ; put "D" (defun mew-spam-bsfilter (val) (let ((case-fold-search t)) (if (string-match "yes" val) ?D))) ; put "o +sapm" at inc ;(defun mew-spam-bsfilter (val) ; (let ((case-fold-search t)) ; (if (string-match "yes" val) "+spam"))) (setq mew-inbox-action-alist '(("X-Spam-Flag:" mew-spam-bsfilter))) ; for "ls" (learn-spam) (setq mew-spam-prog "bsfilter") (setq mew-spam-prog-args '("-C" "-s" "-u")) ; for "lh" (learn-ham) (setq mew-ham-prog "bsfilter") (setq mew-ham-prog-args '("-c" "-S" "-u")) ; for "bm" (mark-spam) (define-key mew-summary-mode-map "bm" 'mew-summary-bsfilter-mark-region) ; code for mew-5.0.51 (defun mew-summary-bsfilter-mark-region (&optional arg) "study/judge the region and put the '*' mark onto spams. 
need to re-learn if judgment of bsfilter is wrong" (interactive "P") (mew-pickable (mew-summary-with-mewl (let* ((folder (mew-pickable-folder)) (msgs (mew-summary-pick-msgs folder t)) (prog "bsfilter") (opts '("-a" "--list-spam")) (pattern nil)) (setq msgs (mew-summary-pick-with-grep prog opts pattern folder msgs)) (mew-summary-pick-ls folder msgs))))) ; show X-Spam-Flag and X-Spam-Probability in message buffer (setq mew-field-spec (reverse (append (list (car (reverse mew-field-spec))) '(("^X-Spam-Probability:$" t) ("^X-Spam-Flag:$" t)) (cdr (reverse mew-field-spec))))) nbkenichi-bsfilter-f0a5a7c/mua/mew6.4/000077500000000000000000000000001465373635000175625ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/mua/mew6.4/mew.el000066400000000000000000000027271465373635000207040ustar00rootroot00000000000000;; $Id: mew.el,v 1.2 2012/06/17 06:57:31 nabeken Exp $ ; moved from .emacs.el (setq mew-spam: "X-Spam-Flag:") ; put "D" (defun mew-spam-bsfilter (val) (let ((case-fold-search t)) (if (string-match "yes" val) ?D))) ; put "o +sapm" at inc ;(defun mew-spam-bsfilter (val) ; (let ((case-fold-search t)) ; (if (string-match "yes" val) "+spam"))) (setq mew-inbox-action-alist '(("X-Spam-Flag:" mew-spam-bsfilter))) ; for "ls" (learn-spam) (setq mew-spam-prog "bsfilter") (setq mew-spam-prog-args '("-C" "-s" "-u")) ; for "lh" (learn-ham) (setq mew-ham-prog "bsfilter") (setq mew-ham-prog-args '("-c" "-S" "-u")) ; for "lm" (mark-spam) (define-key mew-summary-mode-map "lm" 'mew-summary-bsfilter-mark-region) (defun mew-summary-bsfilter-mark-region (&optional arg) "study/judge the region and put the '*' mark onto spams. 
need to re-learn if judgment of bsfilter is wrong" (interactive "P") (mew-pickable (mew-summary-with-mewl (let* ((folder (mew-summary-physical-folder)) (msgs (mew-summary-pick-msgs folder t)) (prog "bsfilter") (opts '("-a" "--list-spam")) (pattern nil)) (setq msgs (mew-summary-pick-with-grep prog opts pattern folder msgs)) (mew-summary-pick-ls folder msgs))))) ; show X-Spam-Flag and X-Spam-Probability in message buffer (setq mew-field-spec (reverse (append (list (car (reverse mew-field-spec))) '(("^X-Spam-Probability:$" t) ("^X-Spam-Flag:$" t)) (cdr (reverse mew-field-spec))))) nbkenichi-bsfilter-f0a5a7c/mua/mew6/000077500000000000000000000000001465373635000174205ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/mua/mew6/mew.el000066400000000000000000000027271465373635000205420ustar00rootroot00000000000000;; $Id: mew.el,v 1.1 2009/04/30 16:50:26 nabeken Exp $ ; moved from .emacs.el (setq mew-spam: "X-Spam-Flag:") ; put "D" (defun mew-spam-bsfilter (val) (let ((case-fold-search t)) (if (string-match "yes" val) ?D))) ; put "o +sapm" at inc ;(defun mew-spam-bsfilter (val) ; (let ((case-fold-search t)) ; (if (string-match "yes" val) "+spam"))) (setq mew-inbox-action-alist '(("X-Spam-Flag:" mew-spam-bsfilter))) ; for "ls" (learn-spam) (setq mew-spam-prog "bsfilter") (setq mew-spam-prog-args '("-C" "-s" "-u")) ; for "lh" (learn-ham) (setq mew-ham-prog "bsfilter") (setq mew-ham-prog-args '("-c" "-S" "-u")) ; for "bm" (mark-spam) (define-key mew-summary-mode-map "bm" 'mew-summary-bsfilter-mark-region) (defun mew-summary-bsfilter-mark-region (&optional arg) "study/judge the region and put the '*' mark onto spams. 
need to re-learn if judgment of bsfilter is wrong" (interactive "P") (mew-pickable (mew-summary-with-mewl (let* ((folder (mew-summary-physical-folder)) (msgs (mew-summary-pick-msgs folder t)) (prog "bsfilter") (opts '("-a" "--list-spam")) (pattern nil)) (setq msgs (mew-summary-pick-with-grep prog opts pattern folder msgs)) (mew-summary-pick-ls folder msgs))))) ; show X-Spam-Flag and X-Spam-Probability in message buffer (setq mew-field-spec (reverse (append (list (car (reverse mew-field-spec))) '(("^X-Spam-Probability:$" t) ("^X-Spam-Flag:$" t)) (cdr (reverse mew-field-spec))))) nbkenichi-bsfilter-f0a5a7c/src/000077500000000000000000000000001465373635000165475ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/src/.rubocop.yml000066400000000000000000000014261465373635000210240ustar00rootroot00000000000000AllCops: TargetRubyVersion: 2.6 Style/FrozenStringLiteralComment: Enabled: false Style/ParenthesesAroundCondition: Enabled: false Style/StringConcatenation: Enabled: false Style/RedundantReturn: Enabled: false Metrics/MethodLength: Enabled: false Metrics/ClassLength: Enabled: false Metrics/ModuleLength: Enabled: false Metrics/AbcSize: Max: 20 Metrics: Enabled: false Style/FormatStringToken: Enabled: false Naming: Enabled: false Style/IfUnlessModifier: Enabled: false Style/NegatedIf: Enabled: false Style/RedundantParentheses: Enabled: false Layout/SpaceAfterNot: Enabled: false Style/GuardClause: Enabled: false Style/RescueStandardError: Enabled: false Style/ConditionalAssignment: Enabled: false Style/Next: Enabled: falsenbkenichi-bsfilter-f0a5a7c/src/bsfilter.rb000077500000000000000000003236771465373635000207330ustar00rootroot00000000000000#! 
/usr/bin/env ruby ## -*-Ruby-*- ## Copyright (C) 2003-2024 NABEYA Kenichi ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 2 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; if not, write to the Free Software ## Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA require 'English' require 'getoptlong' require 'nkf' class Bsfilter def initialize @threads = [] @token_dbs = nil @options = {} @db_hash = {} @jtokenizer = nil end attr_accessor :token_dbs Revision = 'GIT_HASH'.freeze Release = 'GIT_TAG'.freeze Languages = %w[C ja].freeze Default_Language = 'C'.freeze ## Options = Hash::new # used like a global variable ## DB = Hash::new Default_header_prefix = 'Spam'.freeze Default_spam_subject_prefix = '[SPAM] '.freeze Default_refer_header = %w[Ufrom From To Cc Subject Reply-to Return-path Received Content-Transfer-Encoding Content-Type charset Content-Disposition].join(',') Default_jtokenizer = 'bigram'.freeze Default_mark_in_token = "|!*'".freeze Default_homedir = '.bsfilter'.freeze Default_conf_file = 'bsfilter.conf'.freeze Default_pid_file = 'bsfilter.pid'.freeze Default_method = 'rf'.freeze # Robinson Fisher Default_db = 'sdbm'.freeze Default_max_mail = 10_000 Default_min_mail = 8000 Default_max_line = 500 Default_pop_proxy_if = '0.0.0.0'.freeze Default_pop_port = '110'.freeze Default_pop_proxy_port = '10110'.freeze Default_pop_max_size = 50_000 Default_imap_port = '143'.freeze Default_imap_auth = 'auto'.freeze Default_imap_auth_preference = %w[cram-md5 
                                    login loginc].freeze
  Default_icon_number = 32_512

  Clean_ext = '.clean'.freeze
  Spam_ext = '.spam'.freeze
  Prob_ext = '.prob'.freeze
  Lock_ext = '.lock'.freeze

  NDBM_ext = '.ndbm'.freeze
  SDBM_ext = '.sdbm'.freeze
  GDBM_ext = '.gdbm'.freeze
  BDB1_ext = '.bdb1'.freeze
  BDB_ext = '.bdb'.freeze
  QDBM_ext = '.qdbm'.freeze

  EXIT_NORMAL = 0
  CODE_NORMAL = true
  CODE_SPAM = true
  CODE_CLEAN = false

  LOG_CODESET = 'UTF-8'.freeze # codeset for verbose and debug message. nil => no conversion

  ALL_TAGS = %w[html head title meta body div span
                h1 h2 h3 h4 h5 h6
                em strong font basefont big small
                b i s u tt sub sup
                rb rp rt ruby
                blink marquee
                dfn cite abbr acronym
                blockquote q
                br pre ins del center style hr
                ul ol li dl dt dd
                table caption thead tbody tfoot
                colgroup col tr td th
                a link base img address
                form input select option textarea label
                fieldset legend optgroup
                frameset frame noframes iframe].join('|')

  SPACE_TAGS = 'br|p|td|tr|table|ul|ol|dl|li|dt|dd'.freeze

  RE_ALL_TAGS = Regexp.compile('\A<(' + ALL_TAGS + ')\b', Regexp::IGNORECASE)
  RE_SPACE_TAGS = Regexp.compile('\A<(' + SPACE_TAGS + ')\b', Regexp::IGNORECASE)

  SOCKET_TIMEOUT = 30 # for single socket operation

  module Bsutil
    def insert_header!(buf, header, content)
      buf[0] =~ /([\r\n]*)\z/
      eol = ::Regexp.last_match(1)

      (0...buf.length).each do |i|
        if (i.zero? && # unix from line
            (buf[i] =~ /\A>?from\s+(\S+)/))
          next
        elsif (buf[i] =~ /\A(.*?:)/)
          h = ::Regexp.last_match(1)
          if (h == header)
            buf[i] = "#{header} #{content}#{eol}"
            return
          end
        elsif (buf[i] =~ /\A\s+\S/) # folded header
          next
        elsif (buf[i] =~ /\A[\r\n]*\z/) # separator between header and body
          buf[i, 0] = "#{header} #{content}#{eol}"
          return
        else # not header.
may be body without separator buf[i, 0] = "#{header} #{content}#{eol}" return end end buf.push("#{header} #{content}#{eol}") end def append_header!(buf, header, prefix) buf[0] =~ /([\r\n]*)\z/ eol = ::Regexp.last_match(1) append_done = false (0...buf.length).each do |i| if (buf[i] =~ /\A(.*?:)(\s*)(.*?)([\r\n]*)\z/) h = ::Regexp.last_match(1) org_content = ::Regexp.last_match(3) if (h.downcase == header.downcase) buf[i] = "#{header} #{prefix}#{org_content}#{eol}" append_done = true end elsif (!append_done && (((buf[i] =~ /\A\S/) && (buf[i] !~ /\A\S+:/)) || # found body without separator (buf[i] =~ /\A[\r\n]*\z/))) # separator between header and body buf[i, 0] = "#{header} #{prefix}#{eol}" append_done = true break end end buf.push("#{header} #{prefix}#{eol}") if (! append_done) end def x_spam_flag return format('X-%s-Flag:', @options['header-prefix']) end def x_spam_probability return format('X-%s-Probability:', @options['header-prefix']) end def x_spam_revision return format('X-%s-Revision:', @options['header-prefix']) end def insert_headers!(buf, spam_flag, probability = nil) updated = false if (@options['insert-revision']) insert_header!(buf, x_spam_revision, "bsfilter release #{Release} revision #{Revision}") updated = true end if (@options['insert-flag']) updated = true if spam_flag insert_header!(buf, x_spam_flag, 'Yes') else insert_header!(buf, x_spam_flag, 'No') end end if (@options['insert-probability'] && probability) updated = true insert_header!(buf, x_spam_probability, format('%f', probability)) end if (@options['mark-spam-subject']) updated = true append_header!(buf, 'Subject:', @options['spam-subject-prefix']) if spam_flag end return updated end end include Bsutil class DevNull def sync=(*args); end def print(*args); end def printf(*args); end end class DBHash < Hash def flatten(magic = '###', head = '', &block) each do |k, v| if v.instance_of?(DBHash) if (head == '') v.flatten(magic, k, &block) else v.flatten(magic, head + magic + k, &block) end 
elsif (head == '') yield k, v else yield head + magic + k, v end end end def add(hash) hash.each do |k, v| if (self[k]) if (self[k].instance_of?(DBHash) && v.instance_of?(DBHash)) self[k].add(v) else self[k] += v end else self[k] = v # should do deep copy ? end end end def sub(hash) hash.each do |k, v| if (self[k]) if (self[k].instance_of?(DBHash) && v.instance_of?(DBHash)) self[k].sub(v) delete(k) if self[k].empty? elsif (self[k] > v) self[k] -= v else delete(k) end end end end end def safe_require(file) require file return true rescue LoadError return false end def latin2ascii(str) str.force_encoding('ASCII-8BIT') newstr = str.tr("\x92\x93\x94".dup.force_encoding('ASCII-8BIT'), "'''") newstr.tr!("\xc0-\xc5\xc8-\xcb\xcc-\xcf\xd2-\xd6\xd9-\xdc".dup.force_encoding('ASCII-8BIT'), 'AAAAAAEEEEIIIIOOOOOUUUU') newstr.tr!("\xe0-\xe5\xe8-\xeb\xec-\xef\xf2-\xf6\xf9-\xfc".dup.force_encoding('ASCII-8BIT'), 'aaaaaaeeeeiiiiooooouuuu') return newstr end def u2eucjp(str) return NKF.nkf('-e -E -X -Z0', str.encode('EUC-JP', 'UTF-8', undef: :replace, invalid: :replace)).validate_encoding end def u2latin(str) return str.encode('US-ASCII', 'UTF-8', undef: :replace, invalid: :replace) end def gb180302eucjp(str) return str.encode('EUC-JP', 'BIG5', undef: :replace, invalid: :replace) end def open_ro(file) if (file == '-') fh = $stdin yield fh elsif file.instance_of?(Array) file.instance_eval <= self.length) nil else @n = @n + 1 self[@n - 1] end end def readlines @eof = true self end def eof? (@eof || empty?) end EOM yield file else if (! 
FileTest.file?(file)) raise format('%s is not file', file) end fh = File.open(file, 'rb') yield fh fh.close end end def open_wo(file, &block) if (file == '-') fh = $stdout else fh = open(file, 'wb') end if (block) yield fh if (file != '-') fh.close end else return fh end end class FLOAT def initialize(f = 0, power = 1) @mant = 0 @exp = 0 set_f(f, power) end attr_accessor :mant, :exp def to_f return @mant * Math.exp(@exp) end def ln return Math.log(@mant) + @exp end def *(a) n = FLOAT.new if a.instance_of?(FLOAT) n.mant = @mant * a.mant n.exp = @exp + a.exp else n.exp = @exp n.mant = @mant * a end return n end def set_f(a, power = 1) if a.positive? @mant = 1 @exp = Math.log(a) * power elsif a.negative? @mant = -1 @exp = Math.log(-a) * power else @mant = 0 @exp = 0 end self end end module TokenAccess def check_size(max_size, min_size) if ((@file_count <= max_size) || (max_size <= 0) || (min_size <= 0)) return false end old_count = @file_count if (@options['verbose']) @options['message-fh'].printf("reduce token database %s from %d to %d\n", @filename, old_count, min_size) end key_cts.each do |(category, token)| if (category != '.internal') v = value(category, token) || 0 sub_scalar(category, token, (v * (old_count - min_size).to_f / old_count.to_f).ceil) if (@options['debug'] && ! 
value(category, token)) @options['message-fh'].printf("deleted %s %s\n", category, token.to_utf8) end end end @file_count = min_size @dirty = true return true end def value_with_degene(category, token) if value(category, token) return value(category, token) elsif (!@options['degeneration']) # no degeneration return nil else if (v = value(category, token[0..-2])) # cut last char return v end token = token.gsub(Regexp.compile("[#{@options['mark-in-token']}]"), '') if (v = value(category, token)) return v end token = token.downcase if (v = value(category, token)) return v end token = token.upcase if (v = value(category, token)) return v end token = token.capitalize if (v = value(category, token)) return v end return nil end end def set_scalar(category, token, val) @dirty = true @file_count += 1 set(category, token, val) end def add_scalar(category, token, val) @dirty = true @file_count += 1 if (v = value(category, token)) set(category, token, v + val) else set(category, token, val) end end def show_new_token(db) db.each_ct do |category, token| if (!value(category, token) || value(category, token).zero?) 
@options['message-fh'].printf("new %s %s\n", category, token.to_utf8) end end end def values array = [] each_ct do |c, t| array.push(value(c, t)) end return array end def key_cts array = [] each_ct do |c, t| array.push([c, t]) end return array end def export(fh) each_ct do |category, token| fh.printf("%s %s %s %g\n", @language, category, token, value(category, token)) if value(category, token) end end end class TokenDB include TokenAccess def initialize(language = nil) @hash = DBHash.new @file_count = 0 @language = language @message_id = '-' @probability = nil @spam_flag = nil @dirty = false @time = nil @filename = '-' end attr_accessor :hash, :file_count, :probability, :language, :spam_flag, :message_id, :time, :filename def size @hash.size end def each_ct @hash.each_key do |category| @hash[category].each_key do |token| yield(category, token) end end end def value(category, token) if (!@hash[category]) return nil elsif (v = @hash[category][token]) return v else return nil end end def set(category, token, v) @dirty = true @hash[category] = DBHash.new if (! @hash[category]) @hash[category][token] = v end def print_keys_to_str(hash, separator, fh = $stdout) hash.keys.sort.each do |k| v = hash[k] v = v.to_i fh.print separator fh.print(([k] * v).join(separator)) end end def clear @dirty = true @file_count = 0 @hash = DBHash.new end def add_db(db) @dirty = true @file_count += db.file_count @language = db.language if (!@language && db.language) @hash.add(db.hash) end def add_hash(hash) @dirty = true @file_count += 1 @hash.add(hash) end def sub_scalar(category, token, val) @file_count -= 1 if @file_count.positive? @hash.sub({ category => { token => val } }) end def sub_hash(hash) @dirty = true @file_count -= 1 if @file_count.positive? 
      @hash.sub(hash)
    end

    def sub_db(db)
      @dirty = true
      @file_count -= db.file_count
      @file_count = 1 if (@file_count < 1)
      @hash.sub(db.hash)
    end
  end

  class TokenDBM
    include TokenAccess
    MAGIC = '###'.freeze

    def initialize(options, language, _ext)
      @options = options
      @dbm = nil # SDBM not Hash
      @dirty = nil # not used. for TokenAccess
      @lockfh = nil
      @file_count = nil
      @language = language
    end
    attr_accessor :file_count

    def size
      @dbm.size
    end

    def to_db
      token_db = TokenDB.new(@language)
      @dbm.each do |ct, v|
        (category, token) = ct.split(Regexp.new(MAGIC), 2)
        token_db.set(category, token, v)
        token_db.file_count = @file_count
      end
      return token_db
    end

    def clear
      @dbm.clear
      @file_count = 0
      set('.internal', 'file_count', 0)
    end

    def each_ct
      @dbm.each_key do |ct|
        (category, token) = ct.force_encoding('ASCII-8BIT').split(Regexp.new(MAGIC), 2)
        yield(category, token) if (category && token)
      end
    end

    def add_db(token_db)
      add_hash(token_db.hash)
      @file_count += token_db.file_count
    end

    def add_hash(hash)
      @dirty = true
      hash.flatten(MAGIC) do |k, v|
        if (@dbm[k])
          @dbm[k] = (@dbm[k].to_f + v.to_f).to_s
        else
          @dbm[k] = v.to_s
        end
      end
    end

    def sub_db(token_db)
      sub_hash(token_db.hash)
      if (@file_count > token_db.file_count)
        @file_count -= token_db.file_count
      else
        @file_count = 0
      end
    end

    def sub_hash(hash)
      @dirty = true
      hash.flatten(MAGIC) do |k, v|
        if (@dbm[k])
          if (@dbm[k].to_f > v.to_f)
            @dbm[k] = (@dbm[k].to_f - v.to_f).to_s
          else
            @dbm.delete(k)
          end
        end
      end
    end

    def value(category, token)
      v = @dbm[category + MAGIC + token]
      return v.to_f if v

      return nil
    end

    def set(category, token, v)
      @dirty = true
      begin
        @dbm[category + MAGIC + token] = v.to_s
      rescue
        @options['message-fh'].puts($ERROR_INFO.inspect, category + MAGIC + token, v.to_s) if (@options['verbose'])
        @options['message-fh'].puts($ERROR_POSITION) if (@options['debug'])
      end
    end

    def sub_scalar(category, token, v)
      @dirty = true
      if (@file_count.positive?)
@file_count -= 1 end oldv = value(category, token) if (oldv) if (oldv > v) set(category, token, oldv - v) else @dbm.delete(category + MAGIC + token) end end end def open(mode = 'r') @lockfh = File.open(@lockfile, 'w+') case mode when 'r' begin @lockfh.flock(File::LOCK_SH) rescue Errno::EINVAL ## Win9x doesn't support LOCK_SH @lockfh.flock(File::LOCK_EX) end when 'w', 'wr', 'rw' @lockfh.flock(File::LOCK_EX) else raise "internal error: unknown mode #{mode}" end @dbm = open_dbm(@filename, 0o600) if (v = value('.internal', 'file_count')) @file_count = v.to_i else @file_count = 0 set('.internal', 'file_count', @file_count) end if (@options['verbose']) @options['message-fh'].printf("open %s %d tokens %d mails by %d.\n", @filename, @dbm.length, @file_count, Process.pid) end @dirty = false end def close dirty = @dirty set('.internal', 'file_count', @file_count) if dirty if (@options['verbose']) @options['message-fh'].printf("close %s %d tokens %d mails by %d.\n", @filename, @dbm.length, @file_count, Process.pid) end if (@options['debug'] && dirty) key_cts.sort.each do |(c, t)| @options['message-fh'].printf("%s %s %s %f\n", @filename, c, t.to_utf8, value(c, t)) end end @dbm.close @lockfh.flock(File::LOCK_UN) @lockfh.close @dirty = false end end class TokenNDBM < TokenDBM def initialize(options, language, ext) @filename = options['homedir'] + language + ext + NDBM_ext @lockfile = options['homedir'] + language + ext + NDBM_ext + Lock_ext super end def clear @file_count = 0 @dbm.close begin if (@options['verbose']) @options['message-fh'].printf("unlink %s by %d.\n", @filename, Process.pid) end File.unlink(@filename + '.db') rescue @options['message-fh'].puts($ERROR_INFO.inspect) if (@options['verbose']) @options['message-fh'].puts($ERROR_POSITION) if (@options['debug']) end @dbm = open_dbm(@filename, 0o600) if (@options['verbose']) @options['message-fh'].printf("reopen %s by %d.\n", @filename, Process.pid) end end def open_dbm(filename, mode) DBM.open(filename, mode) end end 
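
  # Illustration only, not part of upstream bsfilter: a minimal sketch of the
  # flat-key scheme that TokenDBM and its subclasses appear to rely on.
  # A (category, token) pair is stored under the single string key
  # "category###token", counts are written back as strings, and the mail
  # count is kept under the reserved pair '.internal' / 'file_count'.
  # FlatKeySketch and its method names are hypothetical, and a plain Hash
  # stands in for the DBM/SDBM/GDBM handle.
  module FlatKeySketch
    MAGIC = '###'.freeze # same separator value as TokenDBM::MAGIC

    module_function

    # Build the flat key for one (category, token) pair.
    def join_key(category, token)
      category + MAGIC + token
    end

    # Split a flat key back into [category, token]. The limit of 2 keeps any
    # later '###' inside the token part.
    def split_key(key)
      key.split(MAGIC, 2)
    end

    # Add a count the way TokenDBM#add_hash does: read as Float, store as String.
    def add_count(store, category, token, val)
      key = join_key(category, token)
      store[key] = store[key] ? (store[key].to_f + val.to_f).to_s : val.to_s
    end
  end
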
class TokenSDBM < TokenDBM def initialize(options, language, ext) @filename = options['homedir'] + language + ext + SDBM_ext @lockfile = options['homedir'] + language + ext + SDBM_ext + Lock_ext super end def clear @file_count = 0 @dbm.close begin if (@options['verbose']) @options['message-fh'].printf("unlink %s by %d.\n", @filename, Process.pid) end File.unlink(@filename + '.dir') File.unlink(@filename + '.pag') rescue @options['message-fh'].puts($ERROR_INFO.inspect) if (@options['verbose']) @options['message-fh'].puts($ERROR_POSITION) if (@options['debug']) end @dbm = open_dbm(@filename, 0o600) if (@options['verbose']) @options['message-fh'].printf("reopen %s by %d.\n", @filename, Process.pid) end end def open_dbm(filename, mode) SDBM.open(filename, mode) end end class TokenGDBM < TokenDBM def initialize(options, language, ext) @options = options @filename = @options['homedir'] + language + ext + GDBM_ext @lockfile = @options['homedir'] + language + ext + GDBM_ext + Lock_ext super end def clear @file_count = 0 @dbm.close begin if (@options['verbose']) @options['message-fh'].printf("unlink %s by %d.\n", @filename, Process.pid) end File.unlink(@filename) rescue @options['message-fh'].puts($ERROR_INFO.inspect) if (@options['verbose']) @options['message-fh'].puts($ERROR_POSITION) if (@options['debug']) end @dbm = open_dbm(@filename, 0o600) if (@options['verbose']) @options['message-fh'].printf("reopen %s by %d.\n", @filename, Process.pid) end end def open_dbm(filename, mode) GDBM.open(filename, mode, GDBM::NOLOCK) end end class TokenBDB1 < TokenDBM def initialize(options, language, ext) @filename = options['homedir'] + language + ext + BDB1_ext @lockfile = options['homedir'] + language + ext + BDB1_ext + Lock_ext super end def clear @file_count = 0 @dbm.close begin if (@options['verbose']) @options['message-fh'].printf("unlink %s by %d.\n", @filename, Process.pid) end File.unlink(@filename) rescue @options['message-fh'].puts($ERROR_INFO.inspect) if 
(@options['verbose']) @options['message-fh'].puts($ERROR_POSITION) if (@options['debug']) end @dbm = open_dbm(@filename, 0o600) if (@options['verbose']) @options['message-fh'].printf("reopen %s by %d.\n", @filename, Process.pid) end end def open_dbm(filename, mode) BDB1::Hash.open(filename, BDB1::CREATE | BDB1::WRITE, mode) end end class TokenBDB < TokenDBM def initialize(options, language, ext) @filename = options['homedir'] + language + ext + BDB_ext @lockfile = options['homedir'] + language + ext + BDB_ext + Lock_ext super end def clear @file_count = 0 @dbm.close begin if (@options['verbose']) @options['message-fh'].printf("unlink %s by %d.\n", @filename, Process.pid) end File.unlink(@filename) rescue @options['message-fh'].puts($ERROR_INFO.inspect) if (@options['verbose']) @options['message-fh'].puts($ERROR_POSITION) if (@options['debug']) end @dbm = open_dbm(@filename, 0o600) if (@options['verbose']) @options['message-fh'].printf("reopen %s by %d.\n", @filename, Process.pid) end end def open_dbm(filename, mode) BDB::Hash.open(filename, nil, BDB::CREATE, mode) end end class TokenQDBM < TokenDBM def initialize(options, language, ext) @filename = options['homedir'] + language + ext + QDBM_ext @lockfile = options['homedir'] + language + ext + QDBM_ext + Lock_ext super end def value(category, token) v = @dbm[category + MAGIC + token] rescue DepotError_ENOITEM return nil else return v.to_f end def add_hash(hash) @dirty = true hash.flatten(MAGIC) do |k, v| if (@dbm[k]) @dbm[k] = (@dbm[k].to_f + v.to_f).to_s else @dbm[k] = v.to_s end end end def clear @file_count = 0 @dbm.close begin if (@options['verbose']) @options['message-fh'].printf("unlink %s by %d.\n", @filename, Process.pid) end File.unlink(@filename) rescue @options['message-fh'].puts($ERROR_INFO.inspect) if (@options['verbose']) @options['message-fh'].puts($ERROR_POSITION) if (@options['debug']) end @dbm = open_dbm(@filename, 0o600) if (@options['verbose']) @options['message-fh'].printf("reopen %s by %d.\n", 
@filename, Process.pid) end end def open_dbm(filename, _mode) Depot.open(filename, Depot::OWRITER | Depot::OCREAT) end end def get_lang_from_headers(headers) reg_char_ja = Regexp.compile('\?(iso-2022-jp|iso-2202-jp|x.sjis|shift.jis|euc.jp)\?', Regexp::IGNORECASE) reg_jis = Regexp.compile('\\x1b\\x24[\\x42\\x40]', nil) # escape sequence to jisx0208 new and old @options['refer-header'].each_key do |header_name| str = headers[header_name] if (str) case str when reg_char_ja @options['message-fh'].printf("lang ja header char_ja\n") if (@options['debug']) return ['ja', nil] when reg_jis @options['message-fh'].printf("lang ja header jis\n") if (@options['debug']) return %w[ja jis] end end end return [nil, nil] end def get_lang_from_buf(buf, html_flag) return get_lang(buf, html_flag) end def get_lang(buf, html_flag) reg_euc = Regexp.compile("[\xa1\xa2-\xa1\xbc\xa4\xa1-\xa4\xf3\xa5\xa1-\xa5\xf6]{4}".dup.force_encoding('EUC-JP')) reg_sjis = Regexp.compile("[\x81\x40-\x81\x5b\x82\x9f-\x82\xf1\x83\x40-\x83\x96]{2}".dup.force_encoding('SHIFT_JIS')) reg_utf8 = Regexp.compile("[\xe3\x80\x80-\xe3\x80\x82\xe3\x81\x81-\xe3\x82\x93\xe3\x82\xa1-\xe3\x83\xb6]{4}".dup.force_encoding('UTF-8')) reg_jis = Regexp.compile('\\x1b\\x24[\\x42\\x40]'.dup.force_encoding('ASCII-8BIT')) reg_gb18030_possible = Regexp.compile('[\x80-\x9f]'.dup.force_encoding('ASCII-8BIT')) gb18030_possible = false buf.each do |str| str = decode_character_reference2u(str) if html_flag gb18030_possible = true if (str.force_encoding('ASCII-8BIT') =~ reg_gb18030_possible) str_utf8 = str.encode('UTF-16BE', 'UTF-8', undef: :replace, invalid: :replace).encode('UTF-8', 'UTF-16BE', undef: :replace, invalid: :replace) str_sjis = str.encode('UTF-16BE', 'SHIFT_JIS', undef: :replace, invalid: :replace).encode('SHIFT_JIS', 'UTF-16BE', undef: :replace, invalid: :replace) str_euc = str.encode('UTF-16BE', 'EUC-JP', undef: :replace, invalid: :replace).encode('EUC-JP', 'UTF-16BE', undef: :replace, invalid: :replace) if (str_utf8 =~ 
          reg_utf8)
        @options['message-fh'].printf("lang ja utf8\n") if (@options['debug'])
        return %w[ja utf8]
      elsif (str.force_encoding('ASCII-8BIT') =~ reg_jis)
        @options['message-fh'].printf("lang ja jis\n") if (@options['debug'])
        return %w[ja jis]
      elsif (str_sjis =~ reg_sjis)
        @options['message-fh'].printf("lang ja sjis\n") if (@options['debug'])
        return %w[ja sjis]
      elsif (str_euc =~ reg_euc)
        if gb18030_possible
          @options['message-fh'].printf("lang ja gb18030\n") if (@options['debug'])
          return %w[ja gb18030]
        else
          @options['message-fh'].printf("lang ja euc\n") if (@options['debug'])
          return %w[ja euc]
        end
      end
    end
    return [nil, nil]
  end

  def get_headers(buf, lang)
    headers = DBHash.new
    buf = buf.dup
    header_buf = []
    if ((buf[0] !~ /\A>?from\s+(\S+)/i) && # this isn't mail
        (buf[0] !~ /\A(\S+):/))
      if (@options['max-line'] <= 0)
        return [headers, buf, lang]
      else
        return [headers, buf[0..@options['max-line']], lang]
      end
    end

    num_of_dquote = 0
    ignore_dquote = false
    while (str = buf.shift)
      header_buf.push(str)
      str = str.chomp
      if (str =~ /\A(\S+?):\s*(.*)/)
        current = ::Regexp.last_match(1).downcase
        if (current == 'received')
          headers[current] = ::Regexp.last_match(2).sub(/[\r\n]*\z/, '')
        else
          headers[current] = (headers[current] || '') + ' ' + ::Regexp.last_match(2).sub(/[\r\n]*\z/, '')
        end
      elsif (str =~ /\A>?from\s+(\S+)/i)
        headers['ufrom'] = ::Regexp.last_match(1)
      elsif (str =~ /\A[\r\n]*\z/ && (ignore_dquote || num_of_dquote.even?)) # separator between header and body
        break
      elsif (str =~ /\A\S/ && (ignore_dquote || num_of_dquote.even?)) # found body without separator
        buf.push(str) # rewind
        break
      elsif !current
        break
      elsif (str =~ /\A\s*=\?/)
        headers[current] += str.sub(/[\r\n]*\z/, '').sub(/\A\s*/, '')
      else
        headers[current] += str.sub(/[\r\n]*\z/, '').sub(/\A\s*/, ' ')
      end
      ## start count on from, to and cc line
      ## continue count while number of dquote is odd
      if ((current =~ /\A(from|to|cc)\z/) || num_of_dquote.odd?)
num_of_dquote = num_of_dquote + str.scan(/"/).length - str.scan(/\\"/).length end if (buf.empty? && ! ignore_dquote) # retry? ignore_dquote = true buf.concat(header_buf) header_buf.clear headers.clear end end if ((headers['content-type'] =~ /\bboundary=\s*"(.*?)"/i) || (headers['content-type'] =~ /\bboundary=\s*'(.*?)'/i) || (headers['content-type'] =~ /\bboundary=([^\s;]+)/i)) headers['boundary'] = ::Regexp.last_match(1) end headers['charset'] = ::Regexp.last_match(2) if (headers['content-type'] =~ /charset=(['"]*)([^\s\1;]+)\1/i) headers['content-type'] = ::Regexp.last_match(1) if (headers['content-type'] =~ /\A([^;]+)/) if (@options['max-line'] <= 0) return [headers, buf, lang] else return [headers, buf[0..@options['max-line']], lang] end end class Jtokenizer def initialize(method) case method when 'bigram' @method = proc { |s| bigram(s) } when 'block' @method = proc { |s| block(s) } when 'mecab' @method = proc { |s| mecab(s) } meishi_euc = "\xcc\xbe\xbb\xec".dup.force_encoding('ASCII-8BIT') meishi_sjis = meishi_euc.encode('SHIFT_JIS', 'EUC-JP').force_encoding('ASCII-8BIT') meishi_utf8 = meishi_euc.encode('UTF-8', 'EUC-JP').force_encoding('ASCII-8BIT') @m = MeCab::Tagger.new('-Ochasen') node = @m.parseToNode('this is a pen') if (defined?(MeCab::VERSION)) # defined after 0.90 hinshi = node.next.feature.force_encoding('ASCII-8BIT').split(/,/)[0] else hinshi = node.next.getFeature.force_encoding('ASCII-8BIT').split(/,/)[0] end case hinshi when meishi_euc @m_dic_enc = Encoding::EUC_JP when meishi_sjis @m_dic_enc = Encoding::SHIFT_JIS when meishi_utf8 @m_dic_enc = Encoding::UTF_8 else @m_dic_enc = Encoding.default_external end when 'chasen' Chasen.getopt('-F', '%H %m\n', '-j') @method = proc { |s| chasen(s) } when 'kakasi' @method = proc { |s| kakasi(s) } else raise "internal error: unknown method #{method}" end end def split(str) @method.call(str) end Reg_kanji = Regexp.compile("[\xb0\xa1-\xf4\xa4]+".dup.force_encoding('EUC-JP')) Reg_katakana = 
Regexp.compile("[\xa1\xbc\xa5\xa1-\xa5\xf6]+".dup.force_encoding('EUC-JP')) Reg_kanji_katakana = Regexp.compile("[\xb0\xa1-\xf4\xa4\xa1\xbc\xa5\xa1-\xa5\xf6]".dup.force_encoding('EUC-JP')) Reg_not_kanji_katakana = Regexp.compile("[^\xb0\xa1-\xf4\xa4\xa1\xbc\xa5\xa1-\xa5\xf6]".dup.force_encoding('EUC-JP')) def kakasi(str) str = str.gsub(/[\x00-\x7f]/, ' ') return [] if (str =~ /\A +\z/) array = [] Kakasi.kakasi('-oeuc -w', str).scan(/\S+/).each do |token| token.gsub!(Reg_not_kanji_katakana, '') array.push(token) if ((token =~ Reg_kanji) || (token.length > 2)) end return array end def mecab(str) str = str.encode(@m_dic_enc, invalid: :replace, undef: :replace, replace: ' ') str = str.gsub(/[\x00-\x7f]/, ' ') return [] if (str.empty? || str =~ /\A +\z/) array = [] node = @m.parseToNode(str) while (node && (defined?(MeCab::VERSION) || (node.hasNode == 1))) if defined?(MeCab::VERSION) token = node.surface.encode('EUC-JP', @m_dic_enc) hinshi = node.feature.encode('EUC-JP', @m_dic_enc).split(/,/)[0] else token = node.getSurface.encode('EUC-JP', @m_dic_enc) hinshi = node.getFeature.encode('EUC-JP', @m_dic_enc).split(/,/)[0] end unless token.valid_encoding? # Scrub token token = token.each_char.map { |c| c.valid_encoding? ? 
c : '' }.join end case hinshi when 'BOS/EOS' # Skip BOS/EOS when "\xb5\xad\xb9\xe6".dup.force_encoding('EUC-JP') # Skip KIGOU when "\xcc\xbe\xbb\xec".dup.force_encoding('EUC-JP') # MEISHI array.push(token) if ((token =~ Reg_kanji_katakana) || (token.bytesize > 2)) else token.gsub!(Reg_not_kanji_katakana, '') array.push(token) if ((token =~ Reg_kanji) || (token.bytesize > 2)) end node = node.next end return array end def chasen(str) str = str.gsub(/[\x00-\x7f]/, ' ') return [] if (str =~ /\A +\z/) array = [] Chasen.sparse(str).split("\n").each do |hinshi_token| next unless (hinshi_token =~ /(.*) (.*)/) hinshi = ::Regexp.last_match(1) token = ::Regexp.last_match(2) if (hinshi == "\xcc\xbe\xbb\xec") array.push(token) if ((token =~ Reg_kanji_katakana) || (token.length > 2)) else token.gsub!(Reg_not_kanji_katakana, '') array.push(token) if ((token =~ Reg_kanji) || (token.length > 2)) end end return array end def block(str) tokens = str.scan(Reg_kanji) tokens.concat(str.scan(Reg_katakana)) return tokens end def bigram(str) tokens = [] str.scan(Reg_kanji).each do |token| case token.length when 1, 2 tokens.push(token) else l = token.length - 1 (0..l).each do |i| tokens.push(token[i, 2]) end end end tokens.concat(str.scan(Reg_katakana)) return tokens end end def tokenize_headers(lang, headers) (lang,) = get_lang_from_headers(headers) if (! 
                                                    lang)
    head_db = TokenDB.new(lang)
    reg_token = Regexp.compile("\\b\\d[\\d\\.]+\\d\\b|[\\w#{@options['mark-in-token']}]+")

    if (headers['received'])
      str = headers['received']
      str =~ /envelope-from\s+([\w@.-]+)/
      efrom = ::Regexp.last_match(1)
      str =~ /for\s+<([\w@.-]+)>/
      foraddress = ::Regexp.last_match(1)
      str.sub!(/(\bid|;).*/im, '')
      str.sub!(/\(qmail[^)]*\)/, '')
      str += ' ' + efrom if efrom
      str += ' ' + foraddress if foraddress
      headers['received'] = str
    end

    # if (headers["domainkey-signature"])
    #   headers["domainkey-signature"] = headers["domainkey-signature"].sub(/b=[^:;\s]+/, '')
    # end

    # "authentication-results", "domainkey-signature"

    headers.each do |header, content|
      next unless (@options['refer-all-header'] || @options['refer-header'][header])

      if (lang == 'ja')
        if (content =~ /=\?utf-8\?([bq])/i) && (! @options['utf-8'])
          content = ''
        else
          content = NKF.nkf('-e -X -Z0', content.gsub(/\?(iso-2202-jp|shift-jis)\?/i, '?ISO-2022-JP?')).validate_encoding
        end
      else
        content = latin2ascii(content)
      end

      unless content.valid_encoding? # Scrub str
        content = content.each_char.map { |c| c.valid_encoding? ? c : '' }.join
      end

      content.scan(reg_token).each do |token|
        head_db.add_scalar(header, token, 1) if (token.length < 20)
        @options['message-fh'].printf("tokenizer %s %s\n", header, token.to_utf8) if (@options['debug'])
      end

      if (lang == 'ja')
        @jtokenizer.split(content.gsub(/\s+/, '')).each do |token|
          token.force_encoding('ASCII-8BIT')
          head_db.add_scalar(header, token, 1)
          @options['message-fh'].printf("tokenizer %s %s\n", header, token.to_utf8) if (@options['debug'])
        end
      end
    end
    return head_db
  end

  def tokenize_buf(buf)
    lang = nil # lang is unknown at first
    separators = []
    delimiters = []
    (headers, buf, lang) = get_headers(buf, lang)
    if headers.empty?
# this is not a mail (db, buf) = tokenize_body(lang, headers, buf, separators, delimiters) db.time = Time.new db.language = Default_Language unless db.language ## db.language = Default_Language if (@options["unified-db"]) return db end body_db = TokenDB.new(lang) body_db.message_id = headers['message-id'] || '-' sub_head_db = TokenDB.new(lang) main_head_db = tokenize_headers(lang, headers) lang = main_head_db.language if main_head_db found_html_part = false plain_bodies = [] html_bodies = [] until buf.empty? separators.push('--' + headers['boundary']) if (headers['boundary']) delimiters.push('--' + headers['boundary'] + '--') if (headers['boundary']) if ((!headers['content-type']) || (headers['content-type'] !~ /rfc822/i)) (db, buf) = tokenize_body(lang, headers, buf, separators, delimiters) lang = db.language if (headers['content-type'] =~ /html/i) found_html_part = true html_bodies.push(db) else plain_bodies.push(db) end end (headers, buf, lang) = get_headers(buf, lang) db = tokenize_headers(lang, headers) sub_head_db.add_db(db) end html_bodies.each do |db| body_db.add_db(db) end unless (@options['ignore-plain-text-part'] && found_html_part) # default plain_bodies.each do |db| body_db.add_db(db) end end body_db.add_db(main_head_db) body_db.add_db(sub_head_db) body_db.file_count = 1 body_db.time = Time.new body_db.language = Default_Language unless body_db.language ## body_db.language = Default_Language if (@options["unified-db"]) return body_db end def i2eucjp(i) u2eucjp([i].pack('U')) end def i2ascii(i) latin2ascii(u2latin([i].pack('U'))) end def i2u(i) [i].pack('U') end def decode_character_reference2u(str) reg = Regexp.compile('\&\#(\d{1,5}|x[\da-f]{1,4});'.dup.force_encoding('ASCII-8BIT'), Regexp::IGNORECASE) newstr = if (@options['utf-8']) str.gsub(reg) do hex_or_dec = ::Regexp.last_match(1) if (hex_or_dec =~ /^x(.*)/i) hex_str = ::Regexp.last_match(1) i2u(hex_str.hex).force_encoding('ASCII-8BIT') else i2u(hex_or_dec.to_i).force_encoding('ASCII-8BIT') end 
end else str.gsub(reg, '') end return newstr end def decode_character_reference(str, lang) newstr = if (@options['utf-8']) str.gsub(/&\#(\d{1,5}|x[\da-f]{1,4});/i) do hex_or_dec = ::Regexp.last_match(1) if (hex_or_dec =~ /^x(.*)/i) hex_str = ::Regexp.last_match(1) if (lang == 'ja') i2eucjp(hex_str.hex) else i2ascii(hex_str.hex) end elsif (lang == 'ja') i2eucjp(hex_or_dec.to_i) else i2ascii(hex_or_dec.to_i) end end else str.gsub(/&\#(\d{1,5}|x[\da-f]{1,4});/i, '') end return newstr end def tokenize_str(str, lang) body_hash = DBHash.new(0) url_hash = DBHash.new(0) reg_token = Regexp.compile("(?:http:|www)[\\w\\-\\.\\/@%:\?=]+|[\\w\\-\\.]+@[\\w\\-\\.]+|\\b\\d[\\d\\.]+\\d\\b|[\\w#{@options['mark-in-token']}]+") reg_url = Regexp.compile('(^http:|https:|^www|@)') reg_token2 = Regexp.compile('\b\d[\d\.]+\d\b|[\w%]+') # reg_noret = Regexp::compile('[\r\n]*\z') unless str.valid_encoding? # Scrub str str = str.each_char.map { |c| c.valid_encoding? ? c : '' }.join end str.scan(reg_token).each do |token| if (token =~ reg_url) token.scan(reg_token2).each do |token2| if (token2.length < 20) url_hash[token2] += 1 @options['message-fh'].printf("tokenizer %s %s\n", 'url', token2.to_utf8) if (@options['debug']) end end elsif (token.length < 20) body_hash[token] += 1 @options['message-fh'].printf("tokenizer C %s %s\n", 'body', token.to_utf8) if (@options['debug']) end end if (lang == 'ja') str = str.gsub(Regexp.compile("^[ -\\~]*[\|\>]+".dup.force_encoding('EUC-JP')), '') str.gsub!(Regexp.compile("^[ \\t\xa1\xa1]+".dup.force_encoding('EUC-JP')), '') # delete white space str.gsub!(Regexp.compile('(\\r?\\n){2,}'.dup.force_encoding('EUC-JP')), ' ') # keep multiple newline as space str.gsub!(Regexp.compile('[\\r\\n]+'.dup.force_encoding('EUC-JP')), '') # delete newline str.split.each do |s| @jtokenizer.split(s).each do |token| token.force_encoding('ASCII-8BIT') body_hash[token] += 1 @options['message-fh'].printf("tokenizer ja %s %s\n", 'body', token.to_utf8) if (@options['debug']) end 
      end
    end
    return [body_hash, url_hash]
  end

  def base64_encoded?(buf)
    [buf.dup, buf.reverse].each do |b|
      while (str = b.shift)
        # if (str =~ /\A[\.\s\r\n]*\z/)
        if (str =~ /\A[.\s]*\z/)
          next
        elsif (str =~ %r{\A[A-Za-z0-9=+/]+\s*\z}) # base64 alphabet only; A-z would also admit [ \ ] ^ _ `
          break
        else
          return false
        end
      end
    end
    return true
  end

  def tokenize_body(lang, headers, body, separators, delimiters)
    reg_return_codes = Regexp.compile('[\r\n]*\z')

    db = TokenDB.new(lang)
    body = body.dup

    buf = []

    delimiter = delimiters.last
    separator = separators.last

    if separators.empty?
      buf = body
      body = []
    else
      while (str = body.shift)
        str_noret = str.sub(reg_return_codes, '')
        case str_noret
        when separator
          break
        when delimiter
          delimiters.pop
          separators.pop
          delimiter = delimiters.last
          separator = separators.last
          break
        else
          buf.push(str)
        end
      end
    end

    if (headers['content-type'] && headers['content-type'] !~ /text/i)
      return [db, body] # skip non-text body
    end

    case headers['content-transfer-encoding']
    when /base64/i
      if base64_encoded?(buf)
        ## buf.map! {|str| str.unpack("m*").to_s}
        buf = buf.join.gsub(/[\r\n]/, '').unpack('m*')
      end
    when /quoted-printable/i
      buf.map!
      { |str| str.unpack('M*').join }
    end

    lang_backup = lang
    if (headers['content-type'] =~ /html/i)
      (lang, code) = get_lang_from_buf(buf, true)
    else
      (lang, code) = get_lang_from_buf(buf, false)
    end
    lang ||= lang_backup

    str = buf.join
    str.gsub!(/^begin[^\r\n]+(([\r\n]+M)([^\r\n]+))*/, '') # remove uuencoded lines

    if (lang == 'ja')
      if (code == 'utf8')
        if (@options['utf-8'])
          str = u2eucjp(str)
        else
          lang = Default_Language # can't use iconv / stop ja tokenizer
        end
      elsif (code == 'gb18030')
        if (@options['utf-8'])
          str = gb180302eucjp(str)
        else
          lang = Default_Language
        end
      else
        str = NKF.nkf('-e -X -Z0', str).validate_encoding
      end
    else
      str = latin2ascii(str)
    end

    tags = []
    if (headers['content-type'] =~ /html/i)
      # remove salad at head of part
      encoding = str.encoding
      str.force_encoding('ASCII-8BIT')
      if (str =~ Regexp.compile('\A[^<>]*?(<(\?xml|!doctype|html|body)\b.*)\z', Regexp::MULTILINE | Regexp::IGNORECASE))
        str = ::Regexp.last_match(1)
      end

      # remove salad in head, except style
      if (str =~ /\A(.*?)(<body\b.*)\z/im)
        before_body_tag = ::Regexp.last_match(1)
        after_body_tag = ::Regexp.last_match(2)
        before_body_tag.gsub!(%r{>[^<>]*<(?!/style)}im, '><')
        str = before_body_tag + after_body_tag
      end

      # remove span elements hidden with "display: none"
      str.gsub!(%r{(<span[^<>]*display\s*:\s*none[^>]*>)([^<>]*)(</span>)}im, '')

      str = ::Regexp.last_match(1) if (@options['ignore-after-last-atag']) && (str =~ %r{\A(.*)</a>}im)

      # remove salad after body or html
      if (str =~ Regexp.compile('\A(.*)</body>[^<>]*?\z', Regexp::MULTILINE | Regexp::IGNORECASE))
        str = ::Regexp.last_match(1)
      end
      if (str =~ Regexp.compile('\A(.*)</html>[^<>]*?\z', Regexp::MULTILINE | Regexp::IGNORECASE))
        str = ::Regexp.last_match(1)
      end
      str.gsub!(Regexp.compile('<[^>]*>', Regexp::MULTILINE)) do |t|
        t = t.gsub(/\n/, '')
        if (t =~ RE_ALL_TAGS) # end tags are thrown away
          t.force_encoding(encoding)
          tags.push(t)
        end

        t.force_encoding('ASCII-8BIT')
        t.force_encoding(encoding)
        if (t =~ RE_SPACE_TAGS)
          ' '
        else
          ''
        end
      end
      str.force_encoding(encoding)
      body_str = decode_character_reference(str, lang) # out of tags
      tag_str = decode_character_reference(tags.join, lang) # in tags
    else # if plain text
      body_str = str
      tag_str = ''
    end
    (body_hash, url_body_hash) = tokenize_str(body_str, lang)
    (tag_hash, url_tag_hash) = tokenize_str(tag_str, lang)

    db.add_hash({ 'body' => body_hash }) if (!body_hash.empty? && @options['use-body'])
    db.add_hash({ 'tag' => tag_hash }) unless tag_hash.empty?
    db.add_hash({ 'url' => url_body_hash }) unless url_body_hash.empty?
    db.add_hash({ 'url' => url_tag_hash }) unless url_tag_hash.empty?
    db.file_count = 1
    db.language = lang
    return [db, body]
  end

  # for each lang
  class Probability
    def initialize(options, lang)
      @options = options
      @filename = @options['homedir'] + lang + Prob_ext
      case (@options['db'])
      when 'ndbm'
        @clean = TokenNDBM.new(@options, lang, Clean_ext)
        @spam = TokenNDBM.new(@options, lang, Spam_ext)
        @prob = TokenNDBM.new(@options, lang, Prob_ext)
      when 'sdbm'
        @clean = TokenSDBM.new(@options, lang, Clean_ext)
        @spam = TokenSDBM.new(@options, lang, Spam_ext)
        @prob = TokenSDBM.new(@options, lang, Prob_ext)
      when 'gdbm'
        @clean = TokenGDBM.new(@options, lang, Clean_ext)
        @spam = TokenGDBM.new(@options, lang, Spam_ext)
        @prob = TokenGDBM.new(@options, lang, Prob_ext)
      when 'bdb1'
        @clean = TokenBDB1.new(@options, lang, Clean_ext)
        @spam = TokenBDB1.new(@options, lang, Spam_ext)
        @prob = TokenBDB1.new(@options, lang, Prob_ext)
      when 'bdb'
        @clean = TokenBDB.new(@options, lang, Clean_ext)
        @spam = TokenBDB.new(@options, lang, Spam_ext)
        @prob = TokenBDB.new(@options, lang, Prob_ext)
      when 'qdbm'
        @clean = TokenQDBM.new(@options, lang, Clean_ext)
        @spam = TokenQDBM.new(@options, lang, Spam_ext)
        @prob = TokenQDBM.new(@options, lang, Prob_ext)
      end

      @language = lang
    end

    attr_accessor :prob, :clean, :spam, :spam_cutoff, :language

    def merge_dbs_of_lang(token_dbs)
      new_db = TokenDB.new
      token_dbs.each do |db|
        new_db.add_db(db) if (@language == db.language)
      end
      return new_db
    end
  end

  class Graham < Probability
    def initialize(options, lang)
      @spam_cutoff = 0.9
      @default_probability = 0.4
      super
    end

    def product(a)
      n = 1
      a.each do |v|
        n *= v if (v != 0)
      end
      return n
    end

    def get_combined_probability(token_db)
      prob_db = TokenDB.new # temporary

      token_db.each_ct do |category, token|
        probability = @prob.value_with_degene(category, token)
        if probability
          prob_db.set_scalar(category, token, probability)
        else
          prob_db.set_scalar(category, token, @default_probability) # 0.4
        end
      end

      probs = prob_db.values.sort { |a, b| (b - 0.5).abs <=> (a - 0.5).abs }[0, 15]

      if (@options['debug'])
        prob_array =
          []
        prob_db.each_ct do |c, t|
          prob_array.push([[c, t], prob_db.value(c, t)])
        end
        prob_array.sort! { |a, b| (b[1] - 0.5).abs <=> (a[1] - 0.5).abs }
        prob_array = prob_array[0, 15]
        prob_array.sort! { |a, b| b[1] <=> a[1] }
        prob_array.each do |k, v|
          @options['message-fh'].printf("word probability %s %s %f\n", k[0], k[1].to_str, v)
        end
      end

      prod = product(probs)
      token_db.probability = prod / (prod + product(probs.map { |x| 1 - x }))
      token_db.spam_flag = if (token_db.probability > @spam_cutoff)
                             true
                           else
                             false
                           end
      return token_db
    end

    def update_probability(token_dbs)
      c_count = [@clean.file_count, 1].max
      s_count = [@spam.file_count, 1].max

      if token_dbs.empty?
        incremental = false
        target_cts = @clean.key_cts | @spam.key_cts
        @prob.open('w')
        @prob.clear
      else
        incremental = true
        merged_db = merge_dbs_of_lang(token_dbs)
        target_cts = merged_db.key_cts
        return if target_cts.empty?

        @prob.open('rw')
      end
      old_file_count = @prob.file_count
      new_file_count = 0

      cnum = c_count.to_f
      snum = s_count.to_f

      target_cts.each do |(category, token)|
        c_count = @clean.value(category, token) || 0
        s_count = @spam.value(category, token) || 0
        if (incremental && @prob.value(category, token))
          @prob.sub_scalar(category, token, 1.0) # 1.0 is big enough for delete
          new_file_count -= 1
        end
        if c_count.zero?
          if (s_count > 10)
            new_file_count += 1
            @prob.set_scalar(category, token, 0.9999)
          elsif (s_count > 5)
            new_file_count += 1
            @prob.set_scalar(category, token, 0.9998)
          end
        elsif s_count.zero?
          if (c_count > 10)
            new_file_count += 1
            @prob.set_scalar(category, token, 0.0001)
          elsif (c_count > 5)
            new_file_count += 1
            @prob.set_scalar(category, token, 0.0002)
          end
        elsif (c_count + s_count > 5)
          c = c_count * 2
          s = s_count
          p = [[[s / snum, 1.0].min / ([c / cnum, 1.0].min + [s / snum, 1.0].min), 0.9999].min, 0.0001].max
          new_file_count += 1
          @prob.set_scalar(category, token, p)
        end
      end
      @prob.file_count = new_file_count + old_file_count if incremental
      @prob.close
    end
  end

  class Robinson < Probability
    def initialize(options, lang)
      @robx_max = 1
      @min_dev = 0.1
      @spam_cutoff = 0.582
      @center = 0.5
      @robs = 0.001 # from bogofilter/robinson.h
      @default_robx = 0.415 # from bogofilter/robinson.h / not used
      super
    end

    def get_pw(_category, _token, _g, _b)
      return pw
    end

    def update_probability(token_dbs)
      pwdb = TokenDB.new
      c_count = [@clean.file_count, 1].max
      s_count = [@spam.file_count, 1].max

      if token_dbs.empty?
        incremental = false
        target_cts = @clean.key_cts | @spam.key_cts
      else
        incremental = true
        merged_db = merge_dbs_of_lang(token_dbs)
        target_cts = merged_db.key_cts
        return if target_cts.empty?
      end

      ## loop1
      ## get pw and robx(average of pw)
      count = 0
      pw_sum = 0.0

      good_mail = [1, @clean.file_count].max.to_f
      bad_mail = [1, @spam.file_count].max.to_f
      target_cts.each do |(category, token)|
        g = [@clean.value(category, token) || 0, c_count].min
        b = [@spam.value(category, token) || 0, s_count].min
        n = g + b
        if n.zero?
          pwdb.set_scalar(category, token, nil) # need to delete this token from prob.db
        else
          pw = (b / bad_mail) / (b / bad_mail + g / good_mail)
          if (@robx_max.zero?
              || (n <= @robx_max))
            pw_sum += pw
            count += 1
          end
          pwdb.set_scalar(category, token, pw)
        end
      end

      if incremental
        @prob.open('rw')
        old_file_count = @prob.file_count
        old_robx = @prob.value('.internal', 'robx') || @default_robx
        robx = (pw_sum + old_file_count * old_robx) / (count + old_file_count)
      else
        @prob.open('w')
        @prob.clear
        robx = if (count != 0)
                 pw_sum / count
               else
                 @default_robx
               end
      end
      robs = @robs
      ## loop2
      ## get fw from pw
      new_file_count = 0
      pwdb.key_cts.each do |(category, token)|
        g = [@clean.value(category, token) || 0, c_count].min
        b = [@spam.value(category, token) || 0, s_count].min
        n = g + b
        pw = pwdb.value(category, token)
        if (incremental && @prob.value(category, token))
          new_file_count -= 1
          @prob.sub_scalar(category, token, 1.0) # 1.0 is big enough for delete
        end
        if pw
          new_file_count += 1
          @prob.set_scalar(category, token, (robs * robx + n * pw) / (robs + n)) # fw
        end
      end
      @prob.set_scalar('.internal', 'robx', robx)
      @prob.file_count = new_file_count + old_file_count if incremental
      @prob.close
    end

    def get_probability(pminus, qminus, count)
      r = 1.0 / [1, count].max
      p = 1.0 - Math.exp(pminus.ln * r)
      q = 1.0 - Math.exp(qminus.ln * r)
      s = (1.0 + (p - q) / (p + q)) / 2.0
      return s
    end

    def get_combined_probability(token_db)
      robx = @prob.value('.internal', 'robx') || @default_robx

      count = 0
      pminus = FLOAT.new(1)
      qminus = FLOAT.new(1)
      token_db.each_ct do |category, token|
        probability = @prob.value_with_degene(category, token) || robx
        next unless ((probability - @center).abs > @min_dev)

        if (probability <= 0.0)
          probability = 0.0000001
        elsif (probability >= 1.0)
          probability = 0.9999999
        end
        c = token_db.value(category, token)
        count += c
        pminus *= FLOAT.new(1.0 - probability, c)
        qminus *= FLOAT.new(probability, c)
        if (@options['debug'])
          @options['message-fh'].printf("word probability %s %s %d %f\n", category, token.to_utf8, c, probability)
        end
      end

      token_db.probability = if count.zero?
                               0.0
                             else
                               get_probability(pminus, qminus, count)
                             end
      token_db.spam_flag = if (token_db.probability > @spam_cutoff)
                             true
                           else
                             false
                           end
      return token_db
    end
  end

  class RobinsonFisher < Robinson
    def initialize(options, lang)
      super
      @spam_cutoff = 0.95
    end

    def chi2q(x2, v)
      m = x2 / 2.0
      sum = Math.exp(0.0 - m)
      term = FLOAT.new
      term.exp = 0.0 - m
      term.mant = 1

      (1..(v / 2) - 1).each do |i|
        term *= FLOAT.new(m / i)
        sum += term.to_f
      end
      return sum < 1.0 ? sum : 1.0
    end

    def get_probability(pminus, qminus, count)
      p = 1 - chi2q(-2.0 * pminus.ln, 2 * count)
      q = 1 - chi2q(-2.0 * qminus.ln, 2 * count)
      s = (1.0 + p - q) / 2.0
      return s
    end
  end

  def init_dir(dir)
    return if FileTest.directory?(dir)

    Dir.mkdir(dir, 0o700)
  end

  def soft_raise(str = nil)
    warn str if str
    warn "Try `#{File.basename($PROGRAM_NAME)} --help' for more information."
    exit 2
  end

  def usage
    print <<~EOM
      NAME
          #{File.basename($PROGRAM_NAME)} - bayesian spam filter

      SYNOPSIS
          #{File.basename($PROGRAM_NAME)} [options] [commands] < MAIL
          #{File.basename($PROGRAM_NAME)} [options] [commands] MAIL ...

      DESCRIPTION
          filter spam.
          If commands are specified, bsfilter is in maintenance mode, otherwise it is in filtering mode.
          If bsfilter does not find spam in filtering mode, exit status is 1.
          If bsfilter runs with --pipe option or finds spam, exit status is 0.

      COMMANDS
          --add-clean|-c
              add mails into the clean token database
          --add-spam|-s
              add mails into the spam token database
          --sub-clean|-C
              subtract mails from the clean token database
          --sub-spam|-S
              subtract mails from the spam token database
          --update|-u
              update the probability table from clean and spam token databases
          --export-clean
              export the clean token database
          --export-spam
              export the spam token database
          --import-clean
              import the clean token database
          --import-spam
              import the spam token database
          --export-probability
              export the probability database (for debugging purpose)

      OPTIONS
          --homedir directory
              specify the name of the bsfilter\'s home directory
              If this option is not used, a directory specified with the environment variable "BSFILTERHOME" is used
              If the variable "BSFILTERHOME" is not defined, ".bsfilter" directory under your home is used
              If the variable "HOME" is not defined, a directory which bsfilter is located at is used
          --config-file file
              specify the name of the bsfilter\'s configuration file
              "bsfilter.conf" in bsfilter\'s home directory is used by default
          --max-line number
              check and/or study the first number of lines
              default is #{Default_max_line}. 0 means all
          --db ndbm|sdbm|gdbm|bdb1|bdb|qdbm
              specify the name of database type
              "sdbm" by default
          --jtokenizer|-j bigram|block|mecab|chasen|kakasi
              specify algorithm of a tokenizer for Japanese language
              "bigram" by default
          --list-clean
              print filename of clean mail
          --list-spam
              print filename of spam
          --imap
              access IMAP server
          --imap-server hostname
              specify hostname of IMAP server
          --imap-port number
              specify port number of IMAP server. default is #{Default_imap_port}
          --imap-auth method
              specify authorization method.
              default is "auto"
              "cram-md5" use "AUTHENTICATE CRAM-MD5" command
              "login" use "AUTHENTICATE LOGIN" command
              "loginc" use "LOGIN" command
              "auto" try #{Default_imap_auth_preference.join(', ')} in this order.#{' '}
          --imap-user name
              specify user name of IMAP server
          --imap-password password
              specify password of imap-user
          --imap-folder-clean folder
              specify destination folder for clean mails. "inbox.clean" for example
          --imap-folder-spam folder
              specify destination folder for spams. "inbox.spam" for example
          --imap-fetch-unseen
              filter or study mails without SEEN flag
          --imap-fetch-unflagged
              filter or study mails without "X-Spam-Flag" header
          --imap-reset-seen-flag
              reset SEEN flag when bsfilter moves or modifies mails
          --pop
              work as POP proxy
          --pid-file file
              specify filename for logging process ID of bsfilter
              "bsfilter.pid" in bsfilter\'s home directory is used by default
              this function is valid when "--pop" is specified
          --tasktray
              sit in tasktray
              this is valid with "--pop" on VisualuRuby
          --pop-server hostname
              specify hostname of POP server
          --pop-port number
              specify port number of POP server. default is #{Default_pop_port}
          --pop-proxy-if address
              specify address of interface which bsfilter listens at
              default is 0.0.0.0 and all interfaces are active
          --pop-proxy-port number
              specify port number which bsfilter listens at. default is #{Default_pop_proxy_port}
          --pop-user name
              optional. specify username of POP server.
              bsfilter checks match between value of this option and a name which MUA sends.
              in case of mismatch, bsfilter closes sockets.
          --pop-proxy-set set[,set...]
              specify rules of pop proxy.
              alternative way of pop-server, pop-port, pop-proxy-port and pop-user option.
              format of "set" is "pop-server:pop-port:[proxy-interface]:proxy-port[:pop-user]"
              If proxy-interface is specified and isn\'t 0.0.0.0, other interfaces are not used.
              "--pop-proxy-set 192.168.1.1:110::10110" is equivalent to
              "--pop-server 192.168.1.1 --pop-port 110 --pop-proxy-port 10110"
          --pop-max-size number
              When mail is longer than the specified number, the mail is not filtered.
              When 0 is specified, all mails are tested and filtered.
              unit is byte. default is #{Default_pop_max_size}
          --ssl
              use POP over SSL with --pop option
              use IMAP over SSL with --imap option
          --ssl-cert filename|dirname
              specify a filename of a certificate of a trusted CA or
              a name of a directory of certificates
          --method|-m g|r|rf
              specify filtering method. "rf" by default
              "g" means Paul Graham method,
              "r" means Gary Robinson method,
              and "rf" means Robinson-Fisher method
          --spam-cutoff number
              specify spam-cutoff value
              0.9 by default for Paul Graham method
              0.582 by default for Gary Robinson method
              0.95 by default for Robinson-Fisher method
          --auto-update|-a
              recognize mails, add them into clean or spam token database
              and update the probability table
          --disable-degeneration|-D
              disable degeneration during probability table lookup
          --disable-utf-8
              disable utf-8 support
          --refer-header header[,header...]
              refer specified headers of mails
              "#{Default_refer_header}" by default
          --refer-all-header
              refer all headers of mails
          --ignore-header|-H
              ignore headers of mails
              same as --refer-header ""
          --ignore-body|-B
              ignore body of mails, except URL or mail address
          --ignore-plain-text-part
              ignore plain text part if html part is included in the mail
          --ignore-after-last-atag
              ignore text after last "A" tag
          --mark-in-token "characters"
              specify characters which are allowable in a token
              "#{Default_mark_in_token}" by default
          --show-process
              show summary of execution
          --show-new-token
              show tokens which are newly added into the token database
          --mbox
              use "unix from" to divide mbox format file
          --max-mail number
              reduce token database when the number of stored mails is larger than this one
              #{Default_max_mail} by default
          --min-mail number
              reduce token database as if this number of mails are stored
              #{Default_min_mail} by default
          --pipe
              write a mail to stdout.
              this option is invalid when "--imap" or "--pop" is specified
          --insert-revision
              insert "X-#{Default_header_prefix}-Revision: bsfilter release..." into a mail
          --insert-flag
              insert "X-#{Default_header_prefix}-Flag: Yes" or "X-#{Default_header_prefix}-Flag: No" into a mail
          --insert-probability
              insert "X-#{Default_header_prefix}-Probability: number" into a mail
          --header-prefix string
              valid with --insert-flag and/or --insert-probability option
              insert "X-specified_string-..."
              headers, instead of "#{Default_header_prefix}"
          --mark-spam-subject
              insert "#{Default_spam_subject_prefix}" at the beginning of Subject header
          --spam-subject-prefix string
              valid with --mark-spam-subject option
              insert specified string, instead of "#{Default_spam_subject_prefix}"
          --show-db-status
              show numbers of tokens and mails in databases and quit
          --help|-h
              help
          --quiet|-q
              quiet mode
          --verbose|-v
              verbose mode
          --debug|-d
              debug mode

      EXAMPLES
          % bsfilter -s ~/Mail/spam/* ## add spam

          % bsfilter -u -c ~/Mail/job/* ~/Mail/private/* ## add clean mails and update probability table

          % bsfilter ~/Mail/inbox/1 ## show spam probability

          ## recipe of procmail (1)

          :0 HB
          * ? bsfilter -a
          spam/.

          ## recipe of procmail (2)

          :0 fw
          | bsfilter -a --pipe --insert-flag --insert-probability

          :0
          * ^X-Spam-Flag: Yes
          spam/.

      LICENSE
          this file is distributed under GPL version2

      RELEASE
          #{Release}
    EOM
  end

  class Mbox
    def initialize(options, fh)
      @options = options
      @buf = fh.readlines
      return unless ((@buf.length == 1) && (@buf.last =~ /\r\z/)) # Mac style EOL

      @buf = @buf.last.scan(/.*?\r/)
    end

    def read
      return nil if @buf.empty? # EOF

      if (!@options['mbox']) # one file == one mail
        ret_buf = @buf.dup
        @buf.clear
      else
        ## reg_ufrom = Regexp::compile('^From .*@.* \d{2}:\d{2}:\d{2} ')
        ret_buf = []
        while (str = @buf.shift)
          if (str =~ /^From /)
            if ret_buf.empty?
              # head of mail
              ret_buf.push(str)
            else
              # head of next mail
              @buf.unshift(str) # rewind
              return ret_buf
            end
          else
            ret_buf.push(str)
          end
        end
      end
      return ret_buf
    end
  end

  def update_token_db_one(db, command = @options)
    maintenance_command = ''
    maintenance_command += 'c' if (command['add-clean'])
    maintenance_command += 's' if (command['add-spam'])
    maintenance_command += 'C' if (command['sub-clean'])
    maintenance_command += 'S' if (command['sub-spam'])
    maintenance_command = '-' if (maintenance_command == '')

    show_process(db, maintenance_command) if (@options['show-process'])

    if (command['add-clean'] || command['import-clean'])
      @db_hash[db.language].clean.show_new_token(db) if (@options['show-new-token'])
      @db_hash[db.language].clean.add_db(db)
    end
    if (command['add-spam'] || command['import-spam'])
      @db_hash[db.language].spam.show_new_token(db) if (@options['show-new-token'])
      @db_hash[db.language].spam.add_db(db)
    end
    @db_hash[db.language].clean.sub_db(db) if (command['sub-clean'])
    return unless (command['sub-spam'])

    @db_hash[db.language].spam.sub_db(db)
  end

  def read_exported_text(fh)
    dbs = DBHash.new
    @options['languages'].each do |lang|
      dbs[lang] = TokenDB.new(lang)
      dbs[lang].time = Time.new
    end
    while (str = fh.gets)
      str.chomp!
      next if (str =~ /^\s*#/)

      (lang, category, token, val) = str.split
      val = val.to_f.to_i
      if (category == '.internal')
        dbs[lang].file_count = dbs[lang].file_count + val if (token == 'file_count')
      else
        dbs[lang].add_scalar(category, token, val)
        dbs[lang].file_count = dbs[lang].file_count - 1
      end
    end
    return dbs
  end

  def update_token_dbs(files)
    dbs = []
    @options['languages'].each do |lang|
      @db_hash[lang].clean.open('rw')
      @db_hash[lang].spam.open('rw')
    end

    if (@options['imap'])
      if (@options['ssl'])
        imap = Net::IMAP.new(@options['imap-server'], port: @options['imap-port'], ssl: {cert: @options['ssl-cert']})
      else
        imap = Net::IMAP.new(@options['imap-server'], port: @options['imap-port'])
      end
      imap.auto_authenticate(@options, @options['imap-auth'], @options['imap-user'], @options['imap-password'], @options['imap-auth-preference'])
      files.each do |mailbox|
        target_mailbox = mailbox
        target_mailbox = @options['imap-folder-clean'] if (@options['add-clean'] && @options['imap-folder-clean'])
        target_mailbox = @options['imap-folder-spam'] if (@options['add-spam'] && @options['imap-folder-spam'])
        uids = imap_get_target_uids(imap, mailbox)
        uids.each do |uid|
          imapm = IMAPMessage.new(@options, imap, uid)
          imapm.fetch_rfc822
          db = tokenize_buf(imapm.buf)
          db.filename = uid
          update_token_db_one(db)
          updated = imapm.insert_rfc822_headers!((@options['add-spam'] || @options['sub-clean']), nil)
          if updated
            imapm.append(target_mailbox)
            imapm.set_delete_flag
          elsif (target_mailbox != mailbox)
            imapm.copy(target_mailbox)
            imapm.set_delete_flag
          end
        end
        imap.close
      end
      imap.logout
    else
      files.each do |file|
        open_ro(file) do |fh|
          if (@options['import-clean'] || @options['import-spam'])
            imported_dbs = read_exported_text(fh)
            imported_dbs.each do |_lang, db|
              update_token_db_one(db)
            end
          else
            mbox = Mbox.new(@options, fh)
            while (buf = mbox.read)
              db = tokenize_buf(buf)
              db.filename = file
              dbs.push(db)
              if (@options['pipe'])
                insert_headers!(buf, (@options['add-spam'] || @options['sub-clean']), nil)
                @options['pipe-fh'].print buf.join
              end
              update_token_db_one(db)
            end
          end
        end
      end
    end

    slimed = false
    @options['languages'].each do |lang|
      slimed |= @db_hash[lang].clean.check_size(@options['max-mail'], @options['min-mail'])
      slimed |= @db_hash[lang].spam.check_size(@options['max-mail'], @options['min-mail'])
      @db_hash[lang].clean.close
      @db_hash[lang].spam.close
    end
    dbs.clear if slimed # disable incremental
    return dbs
  end

  def auto_update(token_dbs)
    command = {}
    updated_langs = []
    token_dbs.each do |token_db|
      updated_langs.push(token_db.language)
    end
    updated_langs.uniq.each do |lang|
      @db_hash[lang].clean.open('rw')
      @db_hash[lang].spam.open('rw')
    end

    command['sub-clean'] = false
    command['sub-spam'] = false
    command['import-clean'] = false
    command['import-spam'] = false

    token_dbs.each do |token_db|
      if token_db.spam_flag
        command['add-clean'] = false
        command['add-spam'] = true
      else
        command['add-clean'] = true
        command['add-spam'] = false
      end
      update_token_db_one(token_db, command)
    end

    slimed = false
    updated_langs.uniq.each do |lang|
      slimed |= @db_hash[lang].clean.check_size(@options['max-mail'], @options['min-mail'])
      slimed |= @db_hash[lang].spam.check_size(@options['max-mail'], @options['min-mail'])
    end
    token_dbs.clear if slimed # can't use incremental mode

    updated_langs.uniq.each do |lang|
      @db_hash[lang].update_probability(token_dbs)
      @db_hash[lang].clean.close
      @db_hash[lang].spam.close
    end
  end

  def read_config_file(file)
    configs = []

    open(file) do |fh|
      while (str = fh.gets)
        next if ((str =~ /\A\s*#/) || (str =~ /\A\s*\z/))

        str.chomp!
        str.sub!(/\s+\z/, '')
        str.sub!(/\A\s+/, '')
        tokens = str.split(/\s+/, 2)
        unless tokens.empty?
          tokens[0] = '--' + tokens[0]
          configs.concat(tokens)
        end
      end
    end
    return configs
  end

  def imap_get_target_uids(imap, mailbox)
    if (mailbox =~ %r{(.*)/(.*)})
      mailbox = ::Regexp.last_match(1)
      seqs = ::Regexp.last_match(2)
    else
      seqs = nil
    end
    imap.select(mailbox)
    uids = if (@options['imap-fetch-unseen'])
             if seqs
               imap.uid_search(['UNSEEN', seqs])
             else
               imap.uid_search(['UNSEEN'])
             end
           elsif seqs
             imap.uid_search([seqs])
           else
             imap.uid_search(['ALL'])
           end
    if (@options['imap-fetch-unflagged'])
      yes = imap.uid_search(['HEADER', x_spam_flag.sub(/:$/, ''), 'Yes'])
      no = imap.uid_search(['HEADER', x_spam_flag.sub(/:$/, ''), 'No'])
      if (@options['verbose'])
        @options['message-fh'].printf("imap-fetch-unflagged working original %d Yes %d No %d\n", uids.length, yes.length, no.length)
      end
      ## uids = uids - imap.uid_search(["HEADER", x_spam_flag.sub(/:$/, ''), ""])
      ## Sendmail Advanced Message Server returns all mails when search string is zero-length ???
      uids = uids - yes - no
      if (@options['verbose'])
        @options['message-fh'].printf("imap-fetch-unflagged worked %d\n", uids.length)
      end
    end
    return uids
  end

  class IMAPMessage
    include Bsutil

    def initialize(options, imap, uid = nil)
      @options = options
      @seqno = nil
      @seen = nil
      @uid = uid
      @imap = imap
      @buf = []
    end
    attr_accessor :seqno, :uid, :imap, :buf, :seen

    def fetch_rfc822
      # @options["message-fh"].printf("fetch_rfc822 %d\n", @uid) if (@options["verbose"])
      fetched = @imap.uid_fetch(@uid, %w[RFC822 FLAGS])
      @seqno = fetched[0].seqno
      @buf = fetched[0].attr['RFC822'].split("\n")
      @seen = fetched[0].attr['FLAGS'].include?(:Seen)
      return if @seen

      @imap.uid_store(@uid, '-FLAGS', [:Seen])
    end

    def insert_rfc822_headers!(*args)
      return insert_headers!(@buf, *args)
    end

    def insert_rfc822_header!(header, content)
      # @options["message-fh"].printf("insert_rfc822_header %d %s %s\n", @uid, header, content) if (@options["verbose"])
      insert_header!(@buf, header, content)
    end

    def append(mailbox)
      @buf.map! do |str|
        str.sub(/[\r\n]*\z/, "\r\n")
      end
      # @options["message-fh"].printf("append %d %s\n", @uid, mailbox) if (@options["verbose"])
      if @seen
        @imap.append(mailbox, @buf.join, [:Seen])
      else
        @imap.append(mailbox, @buf.join, [])
      end
    end

    def copy(mailbox)
      # @options["message-fh"].printf("copy %d %s\n", @uid, mailbox) if (@options["verbose"])
      @imap.uid_copy(@uid, mailbox)
    end

    def set_delete_flag
      # @options["message-fh"].printf("set_delete_flag %d\n", @uid) if (@options["verbose"])
      @imap.uid_store(@uid, '+FLAGS', [:Deleted])
    end

    def reset_seen_flag
      # @options["message-fh"].printf("reset_seen_flag %d\n", @uid) if (@options["verbose"])
      @seen = false
      @imap.uid_store(@uid, '-FLAGS', [:Seen])
    end
  end

  def socket_send_rec(command, socket)
    buf = []
    if command
      if (@options['debug'])
        @options['message-fh'].printf('send %s %s', socket, command.sub(/\APASS.*/i, 'PASS ********'))
      end
      socket.write_timeout(command) # pass command to pop-server
    end
    response = socket.gets_timeout # get response from pop-server
    buf.push(response)
    if (@options['debug'])
      @options['message-fh'].printf('resp %s %s', socket, response.sub(/\APASS.*/i, 'PASS ********'))
    end
    if ((response =~ /\A\+OK/) && ((command =~ /\A(RETR|TOP|CAPA)/i) || (command =~ /\A(UIDL|LIST)[^\d]*\z/i)))
      while (response != ".\r\n")
        response = socket.gets_timeout
        buf.push(response)
      end
    end
    return buf
  end

  def pop_proxy_multi(pop_proxy_sets)
    trap('SIGINT') do
      @options['message-fh'].printf("SIGINT received\n") if (@options['verbose'])
      @threads.each do |thread| # kill child threads
        Thread.kill(thread)
      end
    end

    pop_proxy_sets.split(/,/).each do |pop_proxy_set|
      (pop_server, pop_port, pop_proxy_if, pop_proxy_port, pop_user) = pop_proxy_set.split(/:/)
      pop_port = Default_pop_port if (!pop_port || pop_port == '')
      pop_proxy_if = Default_pop_proxy_if if (!pop_proxy_if || pop_proxy_if == '')
      pop_proxy_port = Default_pop_proxy_port if (!pop_proxy_port || pop_proxy_port == '')
      t = Thread.start do # start child threads
        pop_proxy_one(pop_server, pop_port,
                      pop_proxy_if, pop_proxy_port, pop_user)
      end
      @threads.push(t)
    end
    @threads.each(&:join)

    Thread.list.each do |t| # join grandchild threads
      t.join if (t != Thread.current)
    end
    return 0
  end

  def pop_bypass_large_mail(command, pop_socket, pop_proxy_socket)
    pop_socket.write_timeout(command) # RETR to server
    str = pop_socket.gets_timeout # response from server
    pop_proxy_socket.write_timeout(str) # forward
    return if (str =~ /^\A-ERR/)

    while (str != ".\r\n")
      require 'timeout'
      Timeout.timeout(SOCKET_TIMEOUT) do
        pop_proxy_socket.write(str = pop_socket.gets) # forward
      end
    end
    return
  end

  def snoop_list_response(strs)
    h = DBHash.new
    if (strs[0] =~ /\A\+OK\s*(\d+)\s+(\d+)/)
      h[::Regexp.last_match(1)] = ::Regexp.last_match(2).to_i
    else
      strs.each do |str|
        h[::Regexp.last_match(1)] = ::Regexp.last_match(2).to_i if (str =~ /^(\d+)\s+(\d+)/)
      end
    end
    return h
  end

  def pop_proxy_one(pop_server, pop_port, pop_proxy_if, pop_proxy_port, pop_user)
    gs = TCPServer.open(pop_proxy_if, pop_proxy_port)
    addr = gs.addr
    addr.shift
    @options['message-fh'].printf("pop_proxy is on %s\n", addr.join(':')) if (@options['verbose'])
    loop do
      Thread.start(gs.accept) do |pop_proxy_socket| # start grandchild threads
        @options['message-fh'].print(pop_proxy_socket, " is accepted\n") if (@options['verbose'])
        begin
          pop_socket = nil
          Timeout.timeout(SOCKET_TIMEOUT) do
            pop_socket = TCPSocket.open(pop_server, pop_port)
          end
          @options['message-fh'].print(pop_socket, " is connected\n") if (@options['verbose'])
          pop_socket = get_ssl_socket(pop_socket, @options['ssl-cert']) if (@options['ssl'])
          hello = socket_send_rec(nil, pop_socket)[0]
          hello.sub!(/(.*)\r/, "\\1(pop_proxy by bsfilter)\r")
          pop_proxy_socket.write(hello)
          sizes = DBHash.new
          while (command = socket_send_rec(nil, pop_proxy_socket)[0]) # get command from MUA
            if (command =~ /\ARETR\s+(\d+)/i)
              n = ::Regexp.last_match(1)
              if (sizes[n] && (@options['pop-max-size']).positive? && (@options['pop-max-size'] < sizes[n]))
                pop_bypass_large_mail(command, pop_socket, pop_proxy_socket)
                next
              end
            end
            response = socket_send_rec(command, pop_socket)
            if (command =~ /\ALIST/i)
              sizes.update(snoop_list_response(response))
            elsif ((command =~ /\A(TOP|RETR)/i) && (response[0] =~ /\A\+OK/))
              buf = response[1..].dup
              token_db = tokenize_buf(buf)
              @db_hash[token_db.language].prob.open('r')
              @db_hash[token_db.language].get_combined_probability(token_db)
              @db_hash[token_db.language].prob.close
              if (@options['auto-update'])
                auto_update([token_db])
              elsif (@options['show-process'])
                show_process(token_db, '-')
              end
              @options['message-fh'].printf("combined probability %f\n", token_db.probability) if (@options['verbose'])
              insert_headers!(buf, token_db.spam_flag, token_db.probability)
              response[1..-1] = buf
            end
            # don't use elsif
            if (command =~ /QUIT/i)
              @options['message-fh'].printf('send %s %s', pop_proxy_socket, response[0]) if (@options['debug'])
              pop_proxy_socket.write(response.join) # return response to MUA
              break
            elsif ((command =~ /\AUSER\s*(\S*)\r/) && (pop_user && pop_user != ::Regexp.last_match(1)))
              @options['message-fh'].printf("username unmatch error\n")
              pop_proxy_socket.write("-ERR unregistered user\r\n") # return response to MUA
              break
            else
              @options['message-fh'].printf('send %s %s', pop_proxy_socket, response[0]) if (@options['debug'])
              pop_proxy_socket.write(response.join) # return response to MUA
            end
          end
        rescue Timeout::Error # raised by Timeout.timeout; the bare TimeoutError alias is deprecated
          if (@options['verbose'])
            @options['message-fh'].printf("Timeout error %s %s %s\n", pop_server, pop_port, pop_proxy_port)
          end
        rescue
          if (@options['verbose'])
            @options['message-fh'].printf("pop exception caught %s %s %s\n", pop_server, pop_port, pop_proxy_port)
          end
          @options['message-fh'].puts($ERROR_INFO.inspect) if (@options['verbose'])
          @options['message-fh'].puts($ERROR_POSITION) if (@options['debug'])
        ensure
          if (pop_proxy_socket && !pop_proxy_socket.closed?)
            @options['message-fh'].print(pop_proxy_socket, " is gone\n") if (@options['verbose'])
            pop_proxy_socket.close
          end
          if (pop_socket && !pop_socket.closed?)
            @options['message-fh'].print(pop_socket, " is gone\n") if (@options['verbose'])
            pop_socket.close
          end
        end
      end
    end
  end

  def check_options_for_pop!(options)
    options['icon_number'] = (options['icon-number'] || Default_icon_number).to_i
    options['pop-port'] = Default_pop_port unless (options['pop-port'])
    options['pop-proxy-if'] = Default_pop_proxy_if unless (options['pop-proxy-if'])
    options['pop-proxy-port'] = Default_pop_proxy_port unless (options['pop-proxy-port'])
    options['pop-max-size'] = (options['pop-max-size'] || Default_pop_max_size).to_i

    if (options['tasktray'])
      require('vr/vrcontrol')
      require('vr/vrtray')
    end

    if (options['pop-proxy-set'] || options['pop-server'])
      ## ok
    else
      soft_raise("#{$PROGRAM_NAME}: pop-server unspecified")
    end
    return
  end

  def check_options_for_imap!(options)
    error = false
    options['imap-port'] = Default_imap_port unless (options['imap-port'])
    %w[imap-server imap-auth imap-user imap-password].each do |name|
      unless (options[name])
        printf("specify %s\n", name)
        error = true
      end
    end
    raise 'error found in imap options' if error

    return
  end

  def do_imap(command_line_args, token_dbs)
    ret_code = CODE_CLEAN
    if (@options['ssl'])
      imap = Net::IMAP.new(@options['imap-server'], port: @options['imap-port'], ssl: {cert: @options['ssl-cert']})
    else
      imap = Net::IMAP.new(@options['imap-server'], port: @options['imap-port'])
    end
    imap.auto_authenticate(@options, @options['imap-auth'], @options['imap-user'], @options['imap-password'], @options['imap-auth-preference'])
    imap.select(@options['imap-folder-clean']) if (@options['imap-folder-clean']) # only for check
    imap.select(@options['imap-folder-spam']) if (@options['imap-folder-spam']) # only for check
    command_line_args.each do |mailbox|
      uids = imap_get_target_uids(imap, mailbox)
      uids.each do |uid|
        imapm = IMAPMessage.new(@options, imap, uid)
        imapm.fetch_rfc822
token_db = tokenize_buf(imapm.buf) token_db.filename = uid @db_hash[token_db.language].get_combined_probability(token_db) token_dbs.push(token_db) if (@options['verbose']) @options['message-fh'].printf("combined probability %s %d %f\n", mailbox, imapm.seqno, token_db.probability) end target_mailbox = mailbox if token_db.spam_flag target_mailbox = @options['imap-folder-spam'] if (@options['imap-folder-spam']) ret_code = CODE_SPAM elsif (@options['imap-folder-clean']) target_mailbox = @options['imap-folder-clean'] end updated = imapm.insert_rfc822_headers!(token_db.spam_flag, token_db.probability) if updated imapm.reset_seen_flag if (@options['imap-reset-seen-flag']) imapm.append(target_mailbox) imapm.set_delete_flag elsif (target_mailbox != mailbox) imapm.reset_seen_flag if (@options['imap-reset-seen-flag']) imapm.copy(target_mailbox) imapm.set_delete_flag end end imap.close end imap.logout return ret_code end def do_export(command_line_args) file = if command_line_args.empty? '-' else command_line_args[0] end if (@options['export-clean']) open_wo(file) do |fh| @options['languages'].each do |lang| @db_hash[lang].clean.open('r') @db_hash[lang].clean.export(fh) if @db_hash[lang].clean.file_count.positive? @db_hash[lang].clean.close end end end if (@options['export-spam']) open_wo(file) do |fh| @options['languages'].each do |lang| @db_hash[lang].spam.open('r') @db_hash[lang].spam.export(fh) if @db_hash[lang].spam.file_count.positive? @db_hash[lang].spam.close end end end return unless (@options['export-probability']) open_wo(file) do |fh| @options['languages'].each do |lang| @db_hash[lang].prob.open('r') @db_hash[lang].prob.export(fh) if @db_hash[lang].prob.file_count.positive? 
@db_hash[lang].prob.close end end end def setup_imap Net::IMAP.class_eval < true } parser.quiet = true begin parser.each_option do |n, arg| name = n.sub(/^--/, '') if (options[name] && allow_multi[name]) options[name] += (',' + arg) else options[name] = arg.dup end end rescue soft_raise(format("#{$PROGRAM_NAME}: %s", parser.error_message)) end return options end def get_options argv_backup = Marshal.load(Marshal.dump(ARGV)) # shallow copy is enough? options = parse_command_line if (options['config-file'] && !File.file?(options['config-file'])) soft_raise(format("#{$PROGRAM_NAME}: can't open config file `%s'. check argument of --config-file\n", options['config-file'])) end unless (options['homedir']) options['homedir'] = if (ENV['BSFILTERHOME']) ENV['BSFILTERHOME'] elsif (ENV['HOME']) ENV['HOME'] + '/' + Default_homedir elsif defined?(ExerbRuntime) File.dirname(ExerbRuntime.filepath) else File.dirname($PROGRAM_NAME) end end options['config-file'] = options['homedir'] + '/' + Default_conf_file unless (options['config-file']) if (options['config-file'] && File.file?(options['config-file'])) ARGV.clear argv_config = read_config_file(options['config-file']) (argv_config + argv_backup).reverse.each do |argv| ARGV.unshift(argv) end options.update(parse_command_line) end if (options['help']) usage exit 0 end if (options['revision']) print("bsfilter release #{Release} revision #{Revision}\n") exit 0 end options['homedir'] = options['homedir'].sub(%r{/*$}, '') + '/' if (options['method']) if (options['method'] !~ /\A(g|r|rf)\z/) soft_raise(format("#{$PROGRAM_NAME}: unsupported method `%s' for --method or -m\n", options['method'])) end else options['method'] = Default_method end options['header-prefix'] = Default_header_prefix unless (options['header-prefix']) options['spam-subject-prefix'] = Default_spam_subject_prefix unless (options['spam-subject-prefix']) options['db'] = Default_db unless (options['db']) case options['db'] when 'ndbm' require 'dbm' when 'sdbm' require 
'sdbm' when 'gdbm' require 'gdbm' when 'bdb1' require 'bdb1' when 'bdb' require 'bdb' when 'qdbm' require 'depot' else soft_raise(format("#{$PROGRAM_NAME}: unsupported argument `%s' for --db\n", options['db'])) end if (options['jtokenizer']) options['jtokenizer'].downcase! else options['jtokenizer'] = Default_jtokenizer end case options['jtokenizer'] when 'bigram' when 'block' when 'mecab' require 'MeCab' when 'chasen' require 'chasen.o' when 'kakasi' require 'kakasi' else soft_raise(format("#{$PROGRAM_NAME}: unsupported argument `%s' for --jtokenizer or -j\n", options['jtokenizer'])) end @jtokenizer = Jtokenizer.new(options['jtokenizer']) ## if (options["unified-db"]) ## options["languages"] = [Default_Language] ## else ## options["languages"] = Languages ## end options['languages'] = Languages options['mark-in-token'] = Default_mark_in_token unless (options['mark-in-token']) options['mark-in-token'] = options['mark-in-token'].gsub(/\s/, '') options['max-line'] = (options['max-line'] || Default_max_line).to_i options['max-mail'] = (options['max-mail'] || Default_max_mail).to_i options['min-mail'] = (options['min-mail'] || Default_min_mail).to_i options['degeneration'] = options['disable-degeneration'] ? false : true array = if (options['refer-header']) options['refer-header'].downcase.split(',') elsif (options['ignore-header']) [] else Default_refer_header.downcase.split(',') end options['refer-header'] = {} array.each do |header| options['refer-header'][header] = true end options['use-body'] = options['ignore-body'] ? 
false : true options['pid-file'] = options['homedir'] + Default_pid_file unless (options['pid-file']) options['imap-auth'] = options['imap-auth'] || Default_imap_auth options['imap-auth-preference'] = Default_imap_auth_preference # can't modify with command line option options['utf-8'] = if ((!options['disable-utf-8'])) true else false end if (options['pop']) check_options_for_pop!(options) require 'timeout' require 'socket' setup_socket_timeout end if (options['imap']) check_options_for_imap!(options) require 'net/imap' setup_imap end if (options['ssl']) if (options['ssl-cert']) && !File.readable?(options['ssl-cert']) soft_raise(format("#{$PROGRAM_NAME}: can't read %s. check --ssl-cert option", options['ssl-cert'])) end require 'openssl' setup_ssl_socket_timeout end return options end def show_db_status @options['languages'].each do |lang| @db_hash[lang].clean.open('r') @db_hash[lang].spam.open('r') @db_hash[lang].prob.open('r') @options['message-fh'].printf("db %s %d %d %d %d %d\n", lang, @db_hash[lang].clean.size, @db_hash[lang].clean.file_count, @db_hash[lang].spam.size, @db_hash[lang].spam.file_count, @db_hash[lang].prob.size) @db_hash[lang].prob.close @db_hash[lang].spam.close @db_hash[lang].clean.close end end def show_process(token_db, maintenance_command) if (@options['pop']) prot = 'pop' elsif (@options['imap']) prot = "imap" else prot = "file" end case token_db.spam_flag when nil filter_result = '-' when true filter_result = 'spam' when false filter_result = 'clean' else raise 'internal error: unknown spam_flag' end @options['message-fh'].printf("%s %s %s %s %s %s %s\n", prot, token_db.language, filter_result, maintenance_command, token_db.time.strftime('%Y%m%d%H%M%S'), token_db.message_id, token_db.filename) end def spam? 
@token_dbs.last.spam_flag end def probability @token_dbs.last.probability end def setup(command_line_options) @options.clear @db_hash.clear command_line_options_backup = command_line_options.dup argv_backup = ARGV.dup ARGV.clear ARGV.unshift(*command_line_options_backup) unless command_line_options_backup.empty? @options.update(get_options) $stdin.binmode if (@options['quiet']) @options['message-fh'] = DevNull.new @options['pipe-fh'] = DevNull.new elsif (((@options['export-clean'] || @options['export-spam'] || @options['export-probability']) && (ARGV.empty? || (ARGV[0] == '-'))) || # export to stdout @options['list-clean'] || @options['list-spam'] || @options['pipe']) @options['message-fh'] = $stderr @options['pipe-fh'] = $stdout $stdout.binmode else @options['message-fh'] = $stdout @options['pipe-fh'] = $stdout # keep STDOUT in text mode @options['message-fh'].sync = true end @options['mark-in-token'] = Regexp.quote(@options['mark-in-token']) init_dir(@options['homedir']) @options['languages'].each do |lang| case @options['method'] when 'rf' @db_hash[lang] = RobinsonFisher.new(@options, lang) when 'r' @db_hash[lang] = Robinson.new(@options, lang) when 'g' @db_hash[lang] = Graham.new(@options, lang) else raise format('internal error: unknown method %s', @options['method']) end @db_hash[lang].spam_cutoff = @options['spam-cutoff'].to_f if (@options['spam-cutoff']) end rest_options = ARGV.dup ARGV.clear ARGV.unshift(*argv_backup) unless argv_backup.empty? 
return rest_options end def run(command_line_args) @options['message-fh'].print('start ', Time.new.to_s, "\n") if (@options['verbose']) if (@options['show-db-status']) show_db_status return EXIT_NORMAL end if (@options['pop']) write_pid_file(@options['pid-file']) do_pop File.unlink(@options['pid-file']) return EXIT_NORMAL end filtering_mode = true token_dbs = [] @token_dbs = token_dbs if (@options['import-clean'] || @options['import-spam'] || @options['add-clean'] || @options['add-spam'] || @options['sub-clean'] || @options['sub-spam']) filtering_mode = false if (command_line_args.empty? && ! @options['imap']) token_dbs = update_token_dbs(['-']) else token_dbs = update_token_dbs(command_line_args) end end if (@options['export-clean'] || @options['export-spam'] || @options['export-probability']) filtering_mode = false do_export(command_line_args) end if (@options['update']) filtering_mode = false @options['languages'].each do |lang| @db_hash[lang].clean.open('r') @db_hash[lang].spam.open('r') @db_hash[lang].update_probability(token_dbs) # dbs = Array of TokenDB for -c, -s @db_hash[lang].clean.close @db_hash[lang].spam.close end end ret_code = CODE_NORMAL if filtering_mode @options['languages'].each do |lang| @db_hash[lang].prob.open('r') end if (@options['imap']) ret_code = do_imap(command_line_args, token_dbs) else command_line_args = ['-'] if command_line_args.empty? 
ret_code = CODE_CLEAN unless (@options['pipe']) command_line_args.each do |file| open_ro(file) do |fh| number = 1 mbox = Mbox.new(@options, fh) while (buf = mbox.read) token_db = tokenize_buf(buf) token_db.filename = file @db_hash[token_db.language].get_combined_probability(token_db) insert_headers!(buf, token_db.spam_flag, token_db.probability) @options['pipe-fh'].print buf.join if (@options['pipe']) printf("%s\n", file) if (token_db.spam_flag && @options['list-spam']) printf("%s\n", file) if (!token_db.spam_flag && @options['list-clean']) ret_code = CODE_SPAM if (token_db.spam_flag && (!@options['pipe'])) token_dbs.push(token_db) if defined?(fh.path) @options['message-fh'].printf("combined probability %s %d %f\n", fh.path, number, token_db.probability) end number += 1 end end end end @options['languages'].each do |lang| @db_hash[lang].prob.close end $stdout.flush if (@options['auto-update']) auto_update(token_dbs) elsif (@options['show-process']) token_dbs.each do |token_db| show_process(token_db, '-') end end end @options['message-fh'].print('end ', Time.new.to_s, "\n") if (@options['verbose']) return ret_code end end class String def to_utf8 if (Bsfilter::LOG_CODESET) return dup.encode(Bsfilter::LOG_CODESET, Encoding::EUC_JP, undef: :replace, invalid: :replace) else self end end def validate_encoding self.encode(self.encoding, self.encoding, undef: :replace, invalid: :replace) end end if ($PROGRAM_NAME == __FILE__) bsfilter = Bsfilter.new args = bsfilter.setup(ARGV) if bsfilter.run(args) exit 0 else exit 1 end end nbkenichi-bsfilter-f0a5a7c/test/000077500000000000000000000000001465373635000167375ustar00rootroot00000000000000nbkenichi-bsfilter-f0a5a7c/test/test.rb000066400000000000000000000725741465373635000202620ustar00rootroot00000000000000# encoding: utf-8 # -*-Ruby-*- load '../src/bsfilter.rb' require 'test/unit' require 'fileutils' $default_options = ["--homedir", ".", "-v", "-d", "--show-process"] class DummyFH def initialize @buf = Array::new end 
attr_accessor :buf def sync=(*args) end def print(*args) @buf.push(*args.flatten.dup) @buf.map{|str| str.dup.force_encoding('ASCII-8BIT')} @buf = @buf.join.split(/(\r\n|\r|\n)/).each_slice(2).to_a.map{|s| s.join} end def printf(format, *args) @buf.push(sprintf(format, *args)) end def puts(*args) @buf.push(*args.flatten.dup) end end class Bsfilter attr_accessor :options def use_dummyfh options["message-fh"] = DummyFH::new options["pipe-fh"] = DummyFH::new end def grep_message(pattern) options["message-fh"].buf.map{|str| str.force_encoding('UTF-8')}.grep(pattern) end def count_message(pattern) grep_message(pattern).length end def grep_pipe(pattern) options["pipe-fh"].buf.map{|str| str.force_encoding('UTF-8')}.grep(pattern) end def count_pipe(pattern) grep_pipe(pattern).length end end def safe_require(file) begin require file return true rescue LoadError return false end end def unlink_all unlink_prob_ndbm({:force => true}) unlink_token_ndbm({:force => true}) unlink_prob_sdbm({:force => true}) unlink_token_sdbm({:force => true}) unlink_prob_gdbm({:force => true}) unlink_token_gdbm({:force => true}) unlink_prob_bdb1({:force => true}) unlink_token_bdb1({:force => true}) unlink_prob_bdb({:force => true}) unlink_token_bdb({:force => true}) unlink_prob_qdbm({:force => true}) unlink_token_qdbm({:force => true}) end def unlink_prob_ndbm(options = {}) FileUtils.rm(["C.prob.ndbm.db", "C.prob.ndbm.lock", "ja.prob.ndbm.db", "ja.prob.ndbm.lock"], **options) end def unlink_token_ndbm(options = {}) FileUtils.rm(["C.clean.ndbm.db", "C.clean.ndbm.lock", "C.spam.ndbm.db", "C.spam.ndbm.lock", "ja.clean.ndbm.db", "ja.clean.ndbm.lock", "ja.spam.ndbm.db", "ja.spam.ndbm.lock"], **options) end def unlink_prob_sdbm(options = {}) FileUtils.rm(["C.prob.sdbm.dir", "C.prob.sdbm.pag", "C.prob.sdbm.lock", "ja.prob.sdbm.dir", "ja.prob.sdbm.pag", "ja.prob.sdbm.lock"], **options) end def unlink_token_sdbm(options = {}) FileUtils.rm(["C.clean.sdbm.dir", "C.clean.sdbm.pag", "C.clean.sdbm.lock", 
"C.spam.sdbm.dir", "C.spam.sdbm.pag", "C.spam.sdbm.lock", "ja.clean.sdbm.dir", "ja.clean.sdbm.pag", "ja.clean.sdbm.lock", "ja.spam.sdbm.dir", "ja.spam.sdbm.pag", "ja.spam.sdbm.lock"], **options) end def unlink_prob_gdbm(options = {}) FileUtils.rm(["C.prob.gdbm", "C.prob.gdbm.lock", "ja.prob.gdbm", "ja.prob.gdbm.lock"], **options) end def unlink_token_gdbm(options = {}) FileUtils.rm(["C.clean.gdbm", "C.clean.gdbm.lock", "ja.clean.gdbm", "ja.clean.gdbm.lock", "C.spam.gdbm", "C.spam.gdbm.lock", "ja.spam.gdbm", "ja.spam.gdbm.lock"], **options) end def unlink_prob_bdb1(options = {}) FileUtils.rm(["C.prob.bdb1", "C.prob.bdb1.lock", "ja.prob.bdb1", "ja.prob.bdb1.lock"], **options) end def unlink_token_bdb1(options = {}) FileUtils.rm(["C.clean.bdb1", "C.clean.bdb1.lock", "ja.clean.bdb1", "ja.clean.bdb1.lock", "C.spam.bdb1", "C.spam.bdb1.lock", "ja.spam.bdb1", "ja.spam.bdb1.lock"], **options) end def unlink_prob_bdb(options = {}) FileUtils.rm(["C.prob.bdb", "C.prob.bdb.lock", "ja.prob.bdb", "ja.prob.bdb.lock"], **options) end def unlink_token_bdb(options = {}) FileUtils.rm(["C.clean.bdb", "C.clean.bdb.lock", "ja.clean.bdb", "ja.clean.bdb.lock", "C.spam.bdb", "C.spam.bdb.lock", "ja.spam.bdb", "ja.spam.bdb.lock"], **options) end def unlink_prob_qdbm(options = {}) FileUtils.rm(["C.prob.qdbm", "C.prob.qdbm.lock", "ja.prob.qdbm", "ja.prob.qdbm.lock"], **options) end def unlink_token_qdbm(options = {}) FileUtils.rm(["C.clean.qdbm", "C.clean.qdbm.lock", "ja.clean.qdbm", "ja.clean.qdbm.lock", "C.spam.qdbm", "C.spam.qdbm.lock", "ja.spam.qdbm", "ja.spam.qdbm.lock"], **options) end class TestMultipleInstances < Test::Unit::TestCase def test_by_mbox @files = ["testcases/mbox"] @bsfilter0 = Bsfilter::new @bsfilter0.setup($default_options + ["--mbox"]) @bsfilter0.use_dummyfh @bsfilter1 = Bsfilter::new @bsfilter1.setup($default_options) @bsfilter1.use_dummyfh @bsfilter2 = Bsfilter::new @bsfilter2.setup($default_options + ["--mbox"]) @bsfilter2.use_dummyfh @bsfilter3 = Bsfilter::new 
@bsfilter3.setup($default_options) @bsfilter3.use_dummyfh @bsfilter0.run(@files) @bsfilter1.run(@files) @bsfilter2.run(@files) @bsfilter3.run(@files) assert_equal(3, @bsfilter0.count_message(/^file/), "@bsfilter0") assert_equal(1, @bsfilter1.count_message(/^file/), "@bsfilter1") assert_equal(3, @bsfilter2.count_message(/^file/), "@bsfilter2") assert_equal(1, @bsfilter3.count_message(/^file/), "@bsfilter3") end def test_by_jtokenizer return if (! safe_require('MeCab')) return if (! safe_require('chasen.o')) @files = ["testcases/iso_2022_jp_plain"] @bsfilter0 = Bsfilter::new @bsfilter0.setup($default_options + ["--jtokenizer", "bigram"]) @bsfilter0.use_dummyfh @bsfilter1 = Bsfilter::new @bsfilter1.setup($default_options + ["--jtokenizer", "mecab"]) @bsfilter1.use_dummyfh @bsfilter2 = Bsfilter::new @bsfilter2.setup($default_options + ["--jtokenizer", "bigram"]) @bsfilter2.use_dummyfh @bsfilter3 = Bsfilter::new @bsfilter3.setup($default_options + ["--jtokenizer", "chasen"]) @bsfilter3.use_dummyfh @bsfilter0.run(@files) @bsfilter1.run(@files) @bsfilter2.run(@files) @bsfilter3.run(@files) assert_equal(1, @bsfilter0.count_message(/tokenizer ja body 朝顔/), "@bsfilter0 2letters") assert_equal(0, @bsfilter0.count_message(/tokenizer ja body 向日葵/), "@bsfilter0 3letters") assert_equal(1, @bsfilter1.count_message(/tokenizer ja body 朝顔/), "@bsfilter1 2letters") assert_equal(1, @bsfilter1.count_message(/tokenizer ja body 向日葵/), "@bsfilter1 3letters") assert_equal(1, @bsfilter2.count_message(/tokenizer ja body 朝顔/), "@bsfilter2 2letters") assert_equal(0, @bsfilter2.count_message(/tokenizer ja body 向日葵/), "@bsfilter2 3letters") assert_equal(1, @bsfilter3.count_message(/tokenizer ja body 朝顔/), "@bsfilter3 2letters") assert_equal(1, @bsfilter3.count_message(/tokenizer ja body 向日葵/), "@bsfilter3 3letters") end def teardown unlink_all end end class TestGetLang < Test::Unit::TestCase def setup @bsfilter = Bsfilter::new @bsfilter.setup($default_options) @bsfilter.use_dummyfh end def 
test_euc @files = ["testcases/euc_plain_iso_2022_jp"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja euc/)) end def test_sjis @files = ["testcases/sjis_plain_iso_2022_jp"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja sjis/)) end def test_sjis_base64_iso_2022_jp @files = ["testcases/sjis_base64_iso_2022_jp"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja sjis/)) end def test_sjis_base64_iso_2202_jp_typo @files = ["testcases/sjis_base64_iso_2202_jp_typo"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja sjis/)) end def test_iso_2022_jp_plain @files = ["testcases/iso_2022_jp_plain"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja jis/)) end def test_utf8_base64 @files = ["testcases/utf8_base64"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja utf8/)) end def test_utf8_plain @files = ["testcases/utf8_plain"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja utf8/)) end def test_gb18030_base64_gb2312 @files = ["testcases/gb18030_base64_gb2312"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/lang ja gb18030/)) end def teardown unlink_prob_sdbm end end class TestJtokenizer < Test::Unit::TestCase def setup @files = ["testcases/iso_2022_jp_plain"] @bsfilter = Bsfilter::new end def test_bigram @bsfilter.setup($default_options + ["--jtokenizer", "bigram"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/tokenizer ja body 朝顔/), "2 letters") assert_equal(0, @bsfilter.count_message(/tokenizer ja body 向日葵/), "3 letters") end def test_mecab assert_nothing_raised('Warning: ignore this test if MeCab is NOT installed') do @bsfilter.setup($default_options + ["--jtokenizer", "mecab"]) end @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/tokenizer ja body 朝顔/), "2 letters") assert_equal(1, @bsfilter.count_message(/tokenizer ja body 向日葵/), 
"3 letters") end def test_chasen assert_nothing_raised('Warning: ignore this test if chasen is NOT installed') do @bsfilter.setup($default_options + ["--jtokenizer", "chasen"]) end @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/tokenizer ja body 朝顔/), "2 letters") assert_equal(1, @bsfilter.count_message(/tokenizer ja body 向日葵/), "3 letters") end def teardown unlink_all end end class TestBase64 < Test::Unit::TestCase def setup @bsfilter = Bsfilter::new @bsfilter.setup($default_options) @bsfilter.use_dummyfh end def test_delimiter_bug @files = ["testcases/mime_delimiter_bug"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/tokenizer ja body 朝顔/), "japanese") end def test_base64 @files = ["testcases/sjis_base64_iso_2022_jp"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/tokenizer ja body 朝顔/), "japanese") end end class TestPlainTextParser < Test::Unit::TestCase def setup @bsfilter = Bsfilter::new @bsfilter.setup($default_options) @bsfilter.use_dummyfh end def test_folding @files = ["testcases/folding"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/tokenizer ja body 朝顔/), "japanese") assert_equal(0, @bsfilter.count_message(/headtail/), "english") end def test_iso_8895_1 @files = ["testcases/iso_8895_1_plain"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/tokenizer subject replIca/), "replIca") assert_equal(1, @bsfilter.count_message(/tokenizer.*elegant/), "elegant") end def teardown unlink_prob_sdbm end end class TestDBM < Test::Unit::TestCase def setup unlink_all @files = ["testcases/iso_2022_jp_plain", "testcases/ascii_plain"] @bsfilter = Bsfilter::new end def test_default_dbm @bsfilter.setup($default_options + ["-c"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert(File::readable?("ja.clean.sdbm.dir"), "ja.clean.sdbm.dir") assert(File::readable?("C.clean.sdbm.dir"), "C.clean.sdbm.dir") @bsfilter.setup($default_options + ["-u"]) @bsfilter.use_dummyfh 
@bsfilter.run([]) assert(File::readable?("ja.prob.sdbm.dir"), "ja.prob.sdbm.dir") assert(File::readable?("C.prob.sdbm.dir"), "C.prob.sdbm.dir") unlink_token_sdbm unlink_prob_sdbm end def test_ndbm @bsfilter.setup($default_options + ["--db", "ndbm", "-c"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert(File::readable?("ja.clean.ndbm.db"), "ja.clean.ndbm.db") assert(File::readable?("C.clean.ndbm.db"), "C.clean.ndbm.db") @bsfilter.setup($default_options + ["--db", "ndbm", "-u"]) @bsfilter.use_dummyfh @bsfilter.run([]) assert(File::readable?("ja.prob.ndbm.db"), "ja.prob.ndbm.db") assert(File::readable?("C.prob.ndbm.db"), "C.prob.ndbm.db") unlink_token_ndbm unlink_prob_ndbm end def test_sdbm @bsfilter.setup($default_options + ["--db", "sdbm", "-c"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert(File::readable?("ja.clean.sdbm.dir"), "ja.clean.sdbm.dir") assert(File::readable?("C.clean.sdbm.dir"), "C.clean.sdbm.dir") @bsfilter.setup($default_options + ["--db", "sdbm", "-u"]) @bsfilter.use_dummyfh @bsfilter.run([]) assert(File::readable?("ja.prob.sdbm.dir"), "ja.prob.sdbm.dir") assert(File::readable?("C.prob.sdbm.dir"), "C.prob.sdbm.dir") unlink_token_sdbm unlink_prob_sdbm end def test_gdbm assert_nothing_raised('Warning: ignore this test if GDBM is NOT installed') do @bsfilter.setup($default_options + ["--db", "gdbm", "-c"]) end @bsfilter.use_dummyfh @bsfilter.run(@files) assert(File::readable?("ja.clean.gdbm"), "ja.clean.gdbm") assert(File::readable?("C.clean.gdbm"), "C.clean.gdbm") @bsfilter.setup($default_options + ["--db", "gdbm", "-u"]) @bsfilter.use_dummyfh @bsfilter.run([]) assert(File::readable?("ja.prob.gdbm"), "ja.prob.gdbm") assert(File::readable?("C.prob.gdbm"), "C.prob.gdbm") unlink_token_gdbm unlink_prob_gdbm end def test_bdb1 assert_nothing_raised('Warning: ignore this test if BDB1 is NOT installed') do @bsfilter.setup($default_options + ["--db", "bdb1", "-c"]) end @bsfilter.use_dummyfh @bsfilter.run(@files) assert(File::readable?("ja.clean.bdb1"), 
"ja.clean.bdb1") assert(File::readable?("C.clean.bdb1"), "C.clean.bdb1") @bsfilter.setup($default_options + ["--db", "bdb1", "-u"]) @bsfilter.use_dummyfh @bsfilter.run([]) assert(File::readable?("ja.prob.bdb1"), "ja.prob.bdb1") assert(File::readable?("C.prob.bdb1"), "C.prob.bdb1") unlink_token_bdb1 unlink_prob_bdb1 end def test_bdb assert_nothing_raised('Warning: ignore this test if BDB is NOT installed') do @bsfilter.setup($default_options + ["--db", "bdb", "-c"]) end @bsfilter.use_dummyfh @bsfilter.run(@files) assert(File::readable?("ja.clean.bdb"), "ja.clean.bdb") assert(File::readable?("C.clean.bdb"), "C.clean.bdb") @bsfilter.setup($default_options + ["--db", "bdb", "-u"]) @bsfilter.use_dummyfh @bsfilter.run([]) assert(File::readable?("ja.prob.bdb"), "ja.prob.bdb") assert(File::readable?("C.prob.bdb"), "C.prob.bdb") unlink_token_bdb unlink_prob_bdb end def test_qdbm assert_nothing_raised('Warning: ignore this test if QDBM is NOT installed') do @bsfilter.setup($default_options + ["--db", "qdbm", "-c"]) end @bsfilter.use_dummyfh @bsfilter.run(@files) assert(File::readable?("ja.clean.qdbm"), "ja.clean.qdbm") assert(File::readable?("C.clean.qdbm"), "C.clean.qdbm") @bsfilter.setup($default_options + ["--db", "qdbm", "-u"]) @bsfilter.use_dummyfh @bsfilter.run([]) assert(File::readable?("ja.prob.qdbm"), "ja.prob.qdbm") assert(File::readable?("C.prob.qdbm"), "C.prob.qdbm") unlink_token_qdbm unlink_prob_qdbm end end class TestMbox < Test::Unit::TestCase def setup @files = ["testcases/mbox"] @bsfilter = Bsfilter::new end def test_with_mbox @bsfilter.setup($default_options + ["--mbox"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(3, @bsfilter.count_message(/^file/)) end def test_without_mbox @bsfilter.setup($default_options) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/file/)) end def teardown unlink_prob_sdbm end end class TestHeaderParser < Test::Unit::TestCase def setup @bsfilter = Bsfilter::new end def 
test_header_parser @files = ["testcases/header"] @bsfilter.setup($default_options) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/^tokenizer received host1/), "^tokenizer received host1") assert_equal(1, @bsfilter.count_message(/^tokenizer received host2/), "^tokenizer received host2") assert_equal(0, @bsfilter.count_message(/^tokenizer received host3/), "^tokenizer received host3") # drop 2nd hop assert_equal(0, @bsfilter.count_message(/abcdefgh/), "abcdefgh") # drop ID assert_equal(1, @bsfilter.count_message(/^tokenizer subject/), "^tokenizer subject") # refer subject assert_equal(0, @bsfilter.count_message(/^tokenizer date/), "^tokenizer date") # ignore date end def test_ignore_header @files = ["testcases/header"] @bsfilter.setup($default_options + ["--ignore-header"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(0, @bsfilter.count_message(/^tokenizer received host1/), "^tokenizer received host1") assert_equal(0, @bsfilter.count_message(/^tokenizer received host2/), "^tokenizer received host2") assert_equal(0, @bsfilter.count_message(/^tokenizer received host3/), "^tokenizer received host3") assert_equal(0, @bsfilter.count_message(/abcdefgh/), "abcdefgh") assert_equal(0, @bsfilter.count_message(/^tokenizer subject/), "^tokenizer subject") assert_equal(0, @bsfilter.count_message(/^tokenizer date/), "^tokenizer date") end def test_refer_header_null @files = ["testcases/header"] @bsfilter.setup($default_options + ["--refer-header", ""]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(0, @bsfilter.count_message(/^tokenizer received host1/), "^tokenizer received host1") assert_equal(0, @bsfilter.count_message(/^tokenizer received host2/), "^tokenizer received host2") assert_equal(0, @bsfilter.count_message(/^tokenizer received host3/), "^tokenizer received host3") assert_equal(0, @bsfilter.count_message(/abcdefgh/), "abcdefgh") assert_equal(0, @bsfilter.count_message(/^tokenizer subject/), "^tokenizer 
subject") assert_equal(0, @bsfilter.count_message(/^tokenizer date/), "^tokenizer date") end def test_refer_header_subject @files = ["testcases/header"] @bsfilter.setup($default_options + ["--refer-header", "subject"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(0, @bsfilter.count_message(/^tokenizer received host1/), "^tokenizer received host1") assert_equal(0, @bsfilter.count_message(/^tokenizer received host2/), "^tokenizer received host2") assert_equal(0, @bsfilter.count_message(/^tokenizer received host3/), "^tokenizer received host3") assert_equal(0, @bsfilter.count_message(/abcdefgh/), "abcdefgh") assert_equal(1, @bsfilter.count_message(/^tokenizer subject/), "^tokenizer subject") assert_equal(0, @bsfilter.count_message(/^tokenizer date/), "^tokenizer date") end def test_refer_header_date @files = ["testcases/header"] @bsfilter.setup($default_options + ["--refer-header", "date"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(0, @bsfilter.count_message(/^tokenizer received host1/), "^tokenizer received host1") assert_equal(0, @bsfilter.count_message(/^tokenizer received host2/), "^tokenizer received host2") assert_equal(0, @bsfilter.count_message(/^tokenizer received host3/), "^tokenizer received host3") assert_equal(0, @bsfilter.count_message(/abcdefgh/), "abcdefgh") assert_equal(0, @bsfilter.count_message(/^tokenizer subject/), "^tokenizer subject") assert_equal(9, @bsfilter.count_message(/^tokenizer date/), "^tokenizer date") # date header has 9 tokens end def test_refer_header_subject_date @files = ["testcases/header"] @bsfilter.setup($default_options + ["--refer-header", "subject,date"]) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(0, @bsfilter.count_message(/^tokenizer received host1/), "^tokenizer received host1") assert_equal(0, @bsfilter.count_message(/^tokenizer received host2/), "^tokenizer received host2") assert_equal(0, @bsfilter.count_message(/^tokenizer received host3/), "^tokenizer received host3") 
assert_equal(0, @bsfilter.count_message(/abcdefgh/), "abcdefgh") assert_equal(1, @bsfilter.count_message(/^tokenizer subject/), "^tokenizer subject") assert_equal(9, @bsfilter.count_message(/^tokenizer date/), "^tokenizer date") end def test_mime_q_iso_2022_jp @files = ["testcases/mime_q_iso_2022_jp"] @bsfilter.setup($default_options) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/^tokenizer from 匿名/), "^tokenizer subject") assert_equal(1, @bsfilter.count_message(/^tokenizer subject word/), "^tokenizer subject") assert_equal(1, @bsfilter.count_message(/^tokenizer subject 特別/), "^tokenizer subject") end def test_mime_b_iso_2022_jp @files = ["testcases/mime_b_iso_2022_jp"] @bsfilter.setup($default_options) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/^tokenizer subject 花見/), "^tokenizer subject") assert_equal(2, @bsfilter.count_message(/^tokenizer subject 猛暑/), "^tokenizer subject") end def test_mime_b_iso_2202_jp_typo @files = ["testcases/mime_b_iso_2202_jp_typo"] @bsfilter.setup($default_options) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/^tokenizer subject 花見/), "^tokenizer subject") end def test_mime_b_shift_jis @files = ["testcases/mime_b_shift_jis"] @bsfilter.setup($default_options) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/^tokenizer subject 花見/), "^tokenizer subject") end def test_mime_b_shift_jis_bad @files = ["testcases/mime_b_shift_jis_bad"] @bsfilter.setup($default_options) @bsfilter.use_dummyfh @bsfilter.run(@files) assert_equal(1, @bsfilter.count_message(/^tokenizer subject 花見/), "^tokenizer subject") end def teardown unlink_prob_sdbm end end class TestInsertHeader < Test::Unit::TestCase def setup @bsfilter = Bsfilter::new @bsfilter.setup($default_options + ["--pipe", "--insert-revision"]) @bsfilter.use_dummyfh end def test_normal @files = ["testcases/ascii_plain"] @bsfilter.run(@files) 
assert_equal(16, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n') assert_match(/^X-Spam-Revision:/, @bsfilter.options["pipe-fh"].buf[8]) end def test_no_body @files = ["testcases/no_body"] @bsfilter.run(@files) assert_equal(9, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n') assert_match(/^X-Spam-Revision:/, @bsfilter.options["pipe-fh"].buf[8]) end def test_no_boundary @files = ["testcases/no_boundary"] @bsfilter.run(@files) assert_equal(15, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n') assert_match(/^X-Spam-Revision:/, @bsfilter.options["pipe-fh"].buf[8]) end def teardown unlink_prob_sdbm end end class TestMarkSpamSubject < Test::Unit::TestCase def setup unlink_all @files = ["testcases/multi_subject", "testcases/no_body", "testcases/no_boundary"] @bsfilter = Bsfilter::new @bsfilter.setup($default_options + ["-s", "-u"]) @bsfilter.use_dummyfh @bsfilter.run(@files) @bsfilter.setup($default_options + ["--pipe", "--mark-spam-subject", "--insert-revision"]) @bsfilter.use_dummyfh end def test_multi_subject @files = ["testcases/multi_subject"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_pipe(/\ASubject: \[SPAM\] subject1/), "1st subject") assert_equal(1, @bsfilter.count_pipe(/\ASubject: \[SPAM\] subject2/), "2nd subject") end def test_no_body @files = ["testcases/no_body"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_pipe(/\ASubject: \[SPAM\]/), "no body, no subject") end def test_no_boundary @files = ["testcases/no_boundary"] @bsfilter.run(@files) assert_equal(1, @bsfilter.count_pipe(/\ASubject: \[SPAM\]/), "no boundary, no subject") end def teardown unlink_token_sdbm unlink_prob_sdbm end end class TestEOL < Test::Unit::TestCase def setup unlink_all @files = ["testcases/lf", "testcases/crlf", "testcases/cr"] @bsfilter = Bsfilter::new @bsfilter.setup($default_options + ["-s", "-u"]) @bsfilter.use_dummyfh @bsfilter.run(@files) @bsfilter.setup($default_options + ["--pipe", "--mark-spam-subject", "--insert-revision"]) @bsfilter.use_dummyfh end def test_lf 
    @files = ["testcases/lf"]
    @bsfilter.run(@files)
    assert_equal(11, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\z/), '\r')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\n\z/), '\r\n')
  end

  def test_cr
    @files = ["testcases/cr"]
    @bsfilter.run(@files)
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n')
    assert_equal(11, @bsfilter.count_pipe(/\A[^\r\n]*\r\z/), '\r')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\n\z/), '\r\n')
  end

  def test_crlf
    @files = ["testcases/crlf"]
    @bsfilter.run(@files)
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\z/), '\r')
    assert_equal(11, @bsfilter.count_pipe(/\A[^\r\n]*\r\n\z/), '\r\n')
  end

  def teardown
    unlink_token_sdbm
    unlink_prob_sdbm
  end
end

class TestEOLMBox < Test::Unit::TestCase
  def setup
    @bsfilter = Bsfilter::new
    @bsfilter.setup($default_options + ["--mbox", "--pipe", "--insert-revision"])
    @bsfilter.use_dummyfh
  end

  def test_lf
    @files = ["testcases/mbox_lf"]
    @bsfilter.run(@files)
    assert_equal(23, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\z/), '\r')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\n\z/), '\r\n')
  end

  def test_cr
    @files = ["testcases/mbox_cr"]
    @bsfilter.run(@files)
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n')
    assert_equal(23, @bsfilter.count_pipe(/\A[^\r\n]*\r\z/), '\r')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\n\z/), '\r\n')
  end

  def test_crlf
    @files = ["testcases/mbox_crlf"]
    @bsfilter.run(@files)
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\n\z/), '\n')
    assert_equal(0, @bsfilter.count_pipe(/\A[^\r\n]*\r\z/), '\r')
    assert_equal(23, @bsfilter.count_pipe(/\A[^\r\n]*\r\n\z/), '\r\n')
  end

  def teardown
    unlink_prob_sdbm
  end
end

class TestHtmlParser < Test::Unit::TestCase
  def setup
    @files = ["testcases/html"]
    @bsfilter = Bsfilter::new
    @bsfilter.setup($default_options + ["-c"])
    @bsfilter.use_dummyfh
    @bsfilter.run(@files)
    @bsfilter.setup($default_options + ["-u"])
    @bsfilter.use_dummyfh
    @bsfilter.run([])
  end

  def test_default
    @bsfilter.setup($default_options)
    @bsfilter.use_dummyfh
    @bsfilter.run(@files)
    assert_equal(1, @bsfilter.count_message(/^word.*plain_text_part/), "^word.*plain_text_part")
    assert_equal(1, @bsfilter.count_message(/^tokenizer.*ABCDEF/), "ABCDEF")
    assert_equal(1, @bsfilter.count_message(/^tokenizer.*after_atag/), "after_atag")
    assert_equal(0, @bsfilter.count_message(/^tokenizer.*after_html/), "after_html")
    assert_equal(0, @bsfilter.count_message(/^tokenizer.*after_body/), "after_body")
    assert_equal(0, @bsfilter.count_message(/^tokenizer.*fontsize0/), "fontsize0")
    assert_equal(0, @bsfilter.count_message(/^tokenizer.*fontsize1/), "fontsize1")
    assert_equal(0, @bsfilter.count_message(/^tokenizer.*displaynone/), "displaynone")
    assert_equal(1, @bsfilter.count_message(/^tokenizer url 192\.168\.0\.1/), "192.168.0.1")
    assert_equal(1, @bsfilter.count_message(/^tokenizer url www/), "www")
  end

  def test_ignore_plain_text_part
    @bsfilter.setup($default_options + ["--ignore-plain-text-part"])
    @bsfilter.use_dummyfh
    @bsfilter.run(@files)
    assert_equal(0, @bsfilter.count_message(/^word.*plain_text_part/))
  end

  def test_ignore_after_last_atag
    @bsfilter.setup($default_options + ["--ignore-after-last-atag"])
    @bsfilter.use_dummyfh
    @bsfilter.run(@files)
    assert_equal(0, @bsfilter.count_message(/^tokenizer.*after_atag/))
  end

  def teardown
    unlink_all
  end
end

class TestTokenizerOptionCombination < Test::Unit::TestCase
  def setup
    @files = Dir.glob("testcases/[^C]*")
    @option_elements = ["--disable-utf-8", "--ignore-header", "--ignore-body", "--disable-degeneration", "--mark-in-token @",
                        "--ignore-plain-text-part", "--ignore-after-last-atag"]
  end

  def test_option_combination
    i = 0
    imax = 2 ** @option_elements.length
    while (i < imax)
      j = 0
      option_array = Array::new
      while (j < @option_elements.length)
        if ((i >> j) % 2 == 1)
          option_array.concat(@option_elements[j].split)
        end
        j += 1
      end
      bsfilter = Bsfilter::new
      bsfilter.setup($default_options + option_array + ["-q"])
      bsfilter.run(@files)
      i += 1
    end
  end

  def teardown
    unlink_prob_sdbm
  end
end
nbkenichi-bsfilter-f0a5a7c/test/testcases/000077500000000000000000000000001465373635000207355ustar00rootroot00000000000000
nbkenichi-bsfilter-f0a5a7c/test/testcases/ascii_plain000066400000000000000000000004571465373635000231410ustar00rootroot00000000000000
Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST)
Message-Id: <20050723.002441.78702756@example.com>
To: user@example.com
Subject: test
From: user@example.com
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

quote head
tail
> quote head
> tail
| quote head
| tail
nbkenichi-bsfilter-f0a5a7c/test/testcases/cr000066400000000000000000000004151465373635000212640ustar00rootroot00000000000000
Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST)
Message-Id: <20050723.002441.78702756@example.com>
To: user@example.com
Subject: test
From: user@example.com
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Mac style. EOL is cr.
nbkenichi-bsfilter-f0a5a7c/test/testcases/crlf000066400000000000000000000004311465373635000216040ustar00rootroot00000000000000
Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST)
Message-Id: <20050723.002441.78702756@example.com>
To: user@example.com
Subject: test
From: user@example.com
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

DOS style. EOL is crlf.
nbkenichi-bsfilter-f0a5a7c/test/testcases/euc_plain_iso_2022_jp000066400000000000000000000006131465373635000246270ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit 񤭤ޤ꡼򿩤٤ޤ ޤޤ > > | ī | quote head tail > quote head > tail | quote head | tail nbkenichi-bsfilter-f0a5a7c/test/testcases/folding000066400000000000000000000007471465373635000223120ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Subject: =?iso-2022-jp?B?GyRCMkY1WSRfGyhC?= From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit $B3(F|5-$r=q$-$^$7$?!#%"%$%9%/%j!<%`$r?)$Y$^$7$?!#:+(B $BCn$rJa$^$($^$7$?!#(B > $B0zMQ(B $B8~F|(B > $B0*(B | $B0zMQ(B $BD+(B | $B4i(B quote head tail > quote head > tail | quote head | tail nbkenichi-bsfilter-f0a5a7c/test/testcases/gb18030_base64_gb2312000066400000000000000000000007041465373635000237710ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: text/plain; charset="GB2312" Content-Transfer-Encoding: base64 vX3I1dObpPKV+KStpN6kt6S/oaOloqWkpbmlr6XqqWCl4KTyyrOk2aTepLekv6GjwKUKs+ ak8rK2pN6kqKTepLekv6GjCj4g0v3TwyDP8sjVCj4gv/sKfCDS/dPDILOvCnwg7oYKCnF1 b3RlIGhlYWQKdGFpbAo+IHF1b3RlIGhlYWQKPiB0YWlsCnwgcXVvdGUgaGVhZAp8IHRhaW wK nbkenichi-bsfilter-f0a5a7c/test/testcases/header000066400000000000000000000011171465373635000221100ustar00rootroot00000000000000Received: from host3.example.com by host4.example.com (8.12.11/8.12.11) with ESMTP id abcdefgh Received: from host2.example.com by host3.example.com (8.12.11/8.12.11) with ESMTP id abcdefgh Received: from host1.example.com by 
host2.example.com (8.12.11/8.12.11) with ESMTP id abcdefgh Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Subject: test From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit quote head tail > quote head > tail | quote head | tail nbkenichi-bsfilter-f0a5a7c/test/testcases/html000066400000000000000000000016541465373635000216320ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Subject: test From: user@example.com Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="----=boundary_boundary" This is a multi-part message in MIME format. ------=boundary_boundary Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit plain_text_part ------=boundary_boundary Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: 7bit ABCDEF

fontsize0

fontsize1 displaynone in_atag IP address after_atag after_body after_html ------=boundary_boundary-- nbkenichi-bsfilter-f0a5a7c/test/testcases/iso_2022_jp_plain000066400000000000000000000007471465373635000240030ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Subject: =?iso-2022-jp?B?GyRCMkY1WSRfGyhC?= From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit $B3(F|5-$r=q$-$^$7$?!#%"%$%9%/%j!<%`$r?)$Y$^$7$?!#:+(B $BCn$rJa$^$($^$7$?!#(B > $B0zMQ(B $B8~F|(B > $B0*(B | $B0zMQ(B $BD+(B | $B4i(B quote head tail > quote head > tail | quote head | tail nbkenichi-bsfilter-f0a5a7c/test/testcases/iso_8895_1_plain000066400000000000000000000004521465373635000235530ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Subject: rplca wtchz! From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-8859-1 Content-Transfer-Encoding: 7bit Chp, Btifl and lgnt wtchs nbkenichi-bsfilter-f0a5a7c/test/testcases/lf000066400000000000000000000004161465373635000212620ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Subject: test From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit UNIX style. EOL is lf. 
nbkenichi-bsfilter-f0a5a7c/test/testcases/mbox000066400000000000000000000004231465373635000216240ustar00rootroot00000000000000From user@example.com From: user@example.com To: user@example.com Subject: subject1 body1 From user@example.com From: user@example.com To: user@example.com Subject: subject2 body2 From user@example.com From: user@example.com To: user@example.com Subject: subject3 body3 nbkenichi-bsfilter-f0a5a7c/test/testcases/mbox_cr000066400000000000000000000004231465373635000223100ustar00rootroot00000000000000From user@example.com From: user@example.com To: user@example.com Subject: subject1 body1 From user@example.com From: user@example.com To: user@example.com Subject: subject2 body2 From user@example.com From: user@example.com To: user@example.com Subject: subject3 body3 nbkenichi-bsfilter-f0a5a7c/test/testcases/mbox_crlf000066400000000000000000000004471465373635000226400ustar00rootroot00000000000000From user@example.com From: user@example.com To: user@example.com Subject: subject1 body1 From user@example.com From: user@example.com To: user@example.com Subject: subject2 body2 From user@example.com From: user@example.com To: user@example.com Subject: subject3 body3 nbkenichi-bsfilter-f0a5a7c/test/testcases/mbox_lf000066400000000000000000000004231465373635000223050ustar00rootroot00000000000000From user@example.com From: user@example.com To: user@example.com Subject: subject1 body1 From user@example.com From: user@example.com To: user@example.com Subject: subject2 body2 From user@example.com From: user@example.com To: user@example.com Subject: subject3 body3 nbkenichi-bsfilter-f0a5a7c/test/testcases/mime_b_iso_2022_jp000066400000000000000000000004201465373635000241140ustar00rootroot00000000000000Subject: =?iso-2022-jp?B?GyRCJCoyVjgrGyhC?= =?iso-2022-jp?B?IBskQkxUPWsbKEIgGyRCTFQ9axsoQg==?= To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit typical iso-2022-jp MIME-B 
encoding nbkenichi-bsfilter-f0a5a7c/test/testcases/mime_b_iso_2202_jp_typo000066400000000000000000000003121465373635000251670ustar00rootroot00000000000000Subject: =?iso-2202-jp?B?GyRCJCoyVjgrGyhC?= To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit iso-2202-jp is typo nbkenichi-bsfilter-f0a5a7c/test/testcases/mime_b_shift_jis000066400000000000000000000003141465373635000241500ustar00rootroot00000000000000Subject: =?shift-jis?B?gqiJ1Iyp?= To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit regal shift-jis MIME-B encoding nbkenichi-bsfilter-f0a5a7c/test/testcases/mime_b_shift_jis_bad000066400000000000000000000003451465373635000247620ustar00rootroot00000000000000Subject: =?iso-2022-jp?B?gqiJ1Iyp?= To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit mime-b charset is is-2022-jp encoded code is shift-jis nbkenichi-bsfilter-f0a5a7c/test/testcases/mime_delimiter_bug000066400000000000000000000013041465373635000245000ustar00rootroot00000000000000Date: Mon, 12 Feb 2007 14:51:50 +0900 (JST) Subject: testcase for revision 1.84 Mime-Version: 1.0 Content-Type: Multipart/Mixed; boundary="--Next_Part(Mon_Feb_12_14_51_50_2007_244)--" Content-Transfer-Encoding: 7bit ----Next_Part(Mon_Feb_12_14_51_50_2007_244)-- Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: base64 GyRCMyhGfDUtJHI9cSQtJF4kNyQ/ISMlIiUkJTklLyVqITwlYCRyPykkWSReJDckPyEjOisbKEIN ChskQkNuJHJKYSReJCgkXiQ3JD8hIxsoQg0KPiAbJEIwek1RGyhCIBskQjh+RnwbKEINCj4gGyRC MCobKEINCnwgGyRCMHpNURsoQiAbJEJEKxsoQg0KfCAbJEI0aRsoQg0KDQpxdW90ZSBoZWFkDQp0 YWlsDQo+IHF1b3RlIGhlYWQNCj4gdGFpbA0KfCBxdW90ZSBoZWFkDQp8IHRhaWwNCg== ----Next_Part(Mon_Feb_12_14_51_50_2007_244)---- %% garbage %% 
nbkenichi-bsfilter-f0a5a7c/test/testcases/mime_q_iso_2022_jp000066400000000000000000000003221465373635000241340ustar00rootroot00000000000000From: =?ISO-2022-JP?Q?=1B=24BF=3FL=3E=1B=28B=20anonymous=20=1B=24B=255=25s=25W=25k=1B=28B?= Subject: =?UTF-8?Q?=E3=80=81wo?= =?UTF-8?Q?rd=E3=81=AA=E3=81=A9=E3=80=91=E7=89=B9=E5=88=A5?= nbkenichi-bsfilter-f0a5a7c/test/testcases/multi_subject000066400000000000000000000005051465373635000235310ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Subject: subject1 Subject: subject2 From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit quote head tail > quote head > tail | quote head | tail nbkenichi-bsfilter-f0a5a7c/test/testcases/no_body000066400000000000000000000003751465373635000223160ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Cc: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit nbkenichi-bsfilter-f0a5a7c/test/testcases/no_boundary000066400000000000000000000004651465373635000232040ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com Cc: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit quote head tail > quote head > tail | quote head | tail nbkenichi-bsfilter-f0a5a7c/test/testcases/sjis_base64_iso_2022_jp000066400000000000000000000007111465373635000250030ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-2022-JP" Content-Transfer-Encoding: base64 
ikeT+otMgvCPkYKrgtyCtYK9gUKDQYNDg1iDToOKgVuDgILwkEiC14LcgrWCvYFCjakKko 6C8JXfgtyCpoLcgrWCvYFCCj4giPiXcCCM/JP6Cj4giKgKfCCI+JdwIJKpCnwgiucKCnF1 b3RlIGhlYWQKdGFpbAo+IHF1b3RlIGhlYWQKPiB0YWlsCnwgcXVvdGUgaGVhZAp8IHRhaW wK nbkenichi-bsfilter-f0a5a7c/test/testcases/sjis_base64_iso_2202_jp_typo000066400000000000000000000007111465373635000260560ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: text/plain; charset="ISO-2202-JP" Content-Transfer-Encoding: base64 ikeT+otMgvCPkYKrgtyCtYK9gUKDQYNDg1iDToOKgVuDgILwkEiC14LcgrWCvYFCjakKko 6C8JXfgtyCpoLcgrWCvYFCCj4giPiXcCCM/JP6Cj4giKgKfCCI+JdwIJKpCnwgiucKCnF1 b3RlIGhlYWQKdGFpbAo+IHF1b3RlIGhlYWQKPiB0YWlsCnwgcXVvdGUgaGVhZAp8IHRhaW wK nbkenichi-bsfilter-f0a5a7c/test/testcases/sjis_plain_iso_2022_jp000066400000000000000000000006131465373635000250230ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 +0900 (JST) Message-Id: <20050723.002441.78702756@example.com> To: user@example.com From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit GL܂BACXN[Hׂ܂B ߂܂܂B > p > | p | quote head tail > quote head > tail | quote head | tail nbkenichi-bsfilter-f0a5a7c/test/testcases/utf8_base64000066400000000000000000000006411465373635000227130ustar00rootroot00000000000000Message-Id: <20050723.101812.92590895.example@example> To: user@example.com Subject: =?utf-8?B?w6LDqsOuw7TDuyDjgrfjg6Pjg7zjg5njg4Pjg4g=?= From: user@example.com Mime-Version: 1.0 Content-Type: Text/Plain; charset=utf-8 Content-Transfer-Encoding: base64 DQrntbXml6XoqJjjgpLmm7jjgY3jgb7jgZfjgZ/jgILjgqLjgqTjgrnjgq/jg6rjg7zjg6DjgpLp o5/jgbnjgb7jgZfjgZ/jgILmmIYNCuiZq+OCkuaNleOBvuOBiOOBvuOBl+OBn+OAgg0Kw6LDrsO7 w6rDtA0K nbkenichi-bsfilter-f0a5a7c/test/testcases/utf8_plain000066400000000000000000000007341465373635000227350ustar00rootroot00000000000000Date: Sat, 23 Jul 2005 00:24:41 
+0900 (JST)
Message-Id: <20050723.002441.78702756@example.com>
To: user@example.com
Subject: =?iso-2022-jp?B?GyRCMkY1WSRfGyhC?=
From: user@example.com
Mime-Version: 1.0
Content-Type: Text/Plain; charset=utf-8
Content-Transfer-Encoding: 8bit

絵日記を書きました。アイスクリームを食べました。昆
虫を捕まえました。
> 引用 向日
> 葵
| 引用 朝
| 顔

quote head
tail
> quote head
> tail
| quote head
| tail