ChiefsPlanet

Trivers 10-17-2014 03:02 PM

NFT - Anybody here provide (or know of) data scrubbing services? Referral $$$ paid
 
Would prefer to spend $$$ with fellow CPers.

We have 43 databases averaging 50K each that need to be cleaned up.

Seeking consultant.

Referral $$$ paid to your paypal account

Thanks

58-4ever 10-17-2014 03:08 PM

Let me know what kind of services you need. Are you in the KC Metro?

stumppy 10-17-2014 03:08 PM

I got a gal that comes by when I call and cleans my pipes. She's pretty talented, I'm sure she could clean your databases too.......while wearing a french maid outfit.

58-4ever 10-17-2014 03:14 PM

Shoot me a PM

stonedstooge 10-17-2014 03:15 PM

Can we keep your porn?

Rain Man 10-17-2014 04:13 PM

Cleaned up how?

CaliforniaChief 10-17-2014 04:15 PM

http://www.troll.me/images/jesse-pin...nets-bitch.jpg

TLO 10-17-2014 04:17 PM

This thread had potential.... but alas.

srvy 10-17-2014 05:09 PM

Better call Saul.

srvy 10-17-2014 05:12 PM

If it's really bad, The Wolf.

https://www.youtube.com/embed/IgzFPOMjiC8

AustinChief 10-17-2014 05:14 PM

Gonna need more details. Are you looking at something that could be automated or something that requires a manual once-over?

Trivers 10-24-2014 03:01 PM

Thank you for your responses.

Here are the details:

We are sending emails to a list of insurance agents in all fifty states.

The databases are public records.

The first list we need cleaned up is from WisCONsin. (How the natives pronounce it.)

http://oci.wi.gov/agentlic/agntlist.shtml

Scroll down to bottom of page:

Format 2 - Agents by Company Appointments

There are 17 files. First one is al_1st-al.exe. We figured out how to separate columns, and remove the records without email addresses.

1) We need all the first and last name duplicates removed; and 2) all words converted to lowercase.

We would prefer a way to automate this whole process.

If interested, please PM.

Thanks

unlurking 10-24-2014 04:20 PM

*1) sort file | uniq > deduped
2) tr '[:upper:]' '[:lower:]' < deduped > newfile

*(Are you sure you want to remove dupes by name and not just the entire line? If there are two John Smith entries with different info you will lose one. Might be better to remove complete dupe lines instead?)
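
To make the difference concrete, here is a small sketch. It assumes comma-separated lines with the first name in field 1 and the last name in field 2; the sample rows and the function names dedupe_lines and dedupe_names are made up for illustration and are not taken from the Wisconsin files.

```shell
# Drop lines that are identical in full; the first occurrence wins and order is kept.
dedupe_lines() { awk '!seen[$0]++'; }

# Drop lines that repeat the same first,last pair (fields 1 and 2 of a CSV line).
# Assumption: names are in the first two comma-separated fields.
dedupe_names() { awk -F, '!seen[$1 FS $2]++'; }

# Hypothetical sample: two different John Smiths plus one exact duplicate line.
sample='john,smith,john@a.example
john,smith,john@a.example
john,smith,jsmith@b.example
mary,jones,mary@c.example'

printf '%s\n' "$sample" | dedupe_lines   # keeps both distinct John Smith rows
printf '%s\n' "$sample" | dedupe_names   # keeps only the first John Smith row
```

With name-based dedupe the second John Smith (different email) is lost, which is the risk described above.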




Trivers 10-24-2014 04:34 PM

Good catch! Yes.

Thank you!

unlurking 10-24-2014 04:51 PM

If you have a bash shell, this will work...

Code:

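# Cut the fixed-width column ranges, lowercase everything, turn runs of 2+ spaces into commas.
# uniq only drops adjacent duplicate lines; swap in "sort -u" if the input is not already sorted.
cut -c101-135,136-160,350-420 infile | tr '[:upper:]' '[:lower:]' | sed 's/ \s\+/,/g' | uniq > outfile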

EDIT: This was tested using the "al_1st-al.exe" file from the site you linked to. Only 8 dupe lines are dropped.
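
One caveat worth spelling out: uniq compares each line only with the line directly before it, so duplicates that are not next to each other survive. A rough sketch of the difference follows. The helper name clean_list and the raw/ and clean/ directory layout are assumptions for illustration, not something from the original files.

```shell
# clean_list (hypothetical name): lowercase stdin, then drop duplicate lines.
# sort -u sorts first, so duplicates are removed even when they are not adjacent.
clean_list() { tr '[:upper:]' '[:lower:]' | sort -u; }

# uniq alone keeps both "smith" lines here because "jones" sits between them.
printf 'Smith\nJones\nSMITH\n' | tr '[:upper:]' '[:lower:]' | uniq

# clean_list collapses them into one.
printf 'Smith\nJones\nSMITH\n' | clean_list

# Assumed layout: extracted files sit in ./raw as *.txt, cleaned copies go to ./clean.
# for f in raw/*.txt; do clean_list < "$f" > "clean/$(basename "$f")"; done
```

Note that sort -u changes the line order, which should not matter for a mailing list but would matter if the original order needs to be kept.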


All times are GMT -6.