I'm reading data from sets of CSV source files with " text qualifiers. Most of the data is good, but I do run into some bad data that doesn't seem to get handled by redirecting rows. The type of data that is killing me is this:
"Business ID","Business Name","Business Phone"
"12345",""A" SPOT, THE ","555-555-5555"
Access sees this as an unparsable record and continues with the rest of the data. SSIS seems to choke on it. I've set the package to allow 1000 errors with no change.
I did set up an error file from my source file, with error output set to redirect row on error and on truncation. The error output seems only to show columns for "Flat File Source Error Output Column", "ErrorCode", and "ErrorColumn" too. (I'm sure there's a good reason, but I've found it really frustrating that it doesn't use the source file column mappings.)
What I'd like to do is redirect (or fix) the bad rows and continue with the rest of the load.
The Execution Results tab shows the following errors.
[Flat File Source [2082]] Error: The column delimiter for column "Business Phone" was not found.
[Flat File Source [2082]] Error: An error occurred while processing file "\\Share\Files\flat_file.csv" on data row 464.
[DTS.Pipeline] Error: The PrimeOutput method on component "Flat File Source" (2082) returned error code 0xC0202092. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
[DTS.Pipeline] Error: Thread "SourceThread0" has exited with error code 0xC0047038.
[DTS.Pipeline] Error: Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
[DTS.Pipeline] Error: Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
[DTS.Pipeline] Error: Thread "WorkThread1" has exited with error code 0xC0047039.
[DTS.Pipeline] Error: Thread "WorkThread0" has exited with error code 0xC0047039.
Any help would be much appreciated!
two years past...I encounter the same problem.
Just output the source file to other format: tab as delimiter, execl..... to skip the problem.
|||Hi lund,This is because you were using " as text qualifier and there was a data that had value just the same as the text qualifier. The solution is you should replace the " of your data value with another character like '. Or just use another text qualifier.
Best regards,
Hery|||
The other two posters are correct. If you have any control over the source file consider changing the delimeter (to tabs) and/or the text qualifier (to something that doesn't appear in your text.)
Failing that, I have had some success in the past in pre-processing files. Basically you can load the file line by line, treating each line as a single record, and do some processing on the line to clean it.
Then you output the clean lines back into a new file.
Processing could be simple (eg count the number of commas in a row, or verify that if you split a row you get a set number of columns, and they are the correct width.)
Preferred approach is to fix the source data to export to something other than comma separated, though. There is a good reason that .tab files were invented!
Dylan.
No comments:
Post a Comment