NAME
    undupe-mail-body - Print only the non-duplicate messages, based on body.

SYNOPSIS
      undupe-mail-body -help

      undupe-mail-body [mbox2 , [mbox3 , ... ]]

      undupe-mail-body mbox-with-body-dups > mbox-without-body-dups

DESCRIPTION
    I found at least two programs, procmail(1) and mutt(1), which can delete
    duplicate messages based on the "Message-ID" header. I failed to find
    anything which would delete messages based on duplicate BODIES.

    Given *mbox*-format mailboxes, this program prints, on *standard out*,
    only those messages which are unique based on the body alone. The
    original mailbox is accessed only for reading. Only the first
    encountered instance (of a set of duplicates) is retained.

  Incorrect start of email found
    For some messages in a mailbox that otherwise load fine in mutt(1),
    Mail::Mbox::MessageParser reports "Incorrect start of email found". The
    message is not printed if "enable_cache" (and "enable_grep") is turned
    off on the first run, nor on a rerun with the "enable_*" options turned
    on.

    So, please do not be alarmed (as I was) if the above happens.

OPTIONS
    help
        Show help message.

    ordered
        Keep the order of output the same as the input, minus any
        duplicates. Currently, it is a hard-coded value.

TO DO
    Allow the *ordered* option to be set on the command line.

    I would like this program to also work as a filter, so that it reads
    its input from *standard in* as well. This can be achieved by giving
    *STDIN to "Mail::Mbox::MessageParser->new()".

BUGS
    *   After building a cache for the first time for a mailbox, the same
        SCALAR reference is printed (via "printf STDERR ..." in
        get_body_digest()) for all the messages. Any subsequent runs
        produce the expected output.

AUTHOR, LICENSE, DISTRIBUTION, ETC.
    Parv, parv(at)pair(dot)com

    Modified: Nov 17 2005

    This software is free to be used in any form only if proper credit is
    given. I am not responsible for any kind of damage or loss. Use it at
    your own risk.
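
EXAMPLE
    The behaviour described under DESCRIPTION (keep the first message seen
    for each distinct body, read the mailboxes without modifying them) can
    be sketched as follows. This is not the program's own Perl code; it is
    a minimal illustration in Python using the standard mailbox and hashlib
    modules. The names body_of and unique_by_body, the use of SHA-256 as
    the digest, and the rule that the body is everything after the first
    blank line are all assumptions made for the sketch.

```python
import hashlib
import mailbox
import sys


def body_of(raw_message: bytes) -> bytes:
    """Return everything after the first blank line (assumed to be the body)."""
    _headers, _sep, body = raw_message.partition(b"\n\n")
    return body


def unique_by_body(paths):
    """Yield raw messages whose body has not been seen before.

    Messages are visited in input order, so the first instance of each
    body is the one that is kept.
    """
    seen = set()
    for path in paths:
        # create=False: raise an error instead of creating a missing mailbox.
        box = mailbox.mbox(path, create=False)
        try:
            for key in box.iterkeys():
                # from_=True keeps the leading "From " separator line.
                raw = box.get_bytes(key, from_=True)
                # Hypothetical digest choice; any stable hash would do.
                digest = hashlib.sha256(body_of(raw)).digest()
                if digest in seen:
                    continue
                seen.add(digest)
                yield raw
        finally:
            box.close()


if __name__ == "__main__":
    # Write the surviving messages, each followed by a blank separator line.
    for raw in unique_by_body(sys.argv[1:]):
        sys.stdout.buffer.write(raw + b"\n")
```

    Saved under a hypothetical name such as sketch.py, it would be run as
    "python sketch.py mbox-with-body-dups > mbox-without-body-dups". Storing
    only digests in the "seen" set keeps memory use small even when the
    bodies themselves are large.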