Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations SkipVought on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Excel: Importing European account information (German bank spreadsheets are the worst.). 2

Status
Not open for further replies.

jlockley

Technical User
Nov 28, 2001
1,522
0
0
US
I've just spent a week figuring out how to best import German and Spanish bank statements into Excel in America. Some European statements import well, others less so. (In this case Deutsche Bank and Banco Sabadell). As I've noticed in researching the problem that is not uncommon, I decided to put it here.
Note: Legacy import is necessary if you are using Office 365. It can be enabled under Options/advanced.

How to convert German .csv or .txt files to Excel
:
1. Import as cvs. Open in Notepad..
2. Save File as Ansii (European files are generally encoded as UTF 8, which does not import well into US spreadsheets. )
3. Import Ansii file into Excel. Although the single transactions will import to separate rows, most of their data will be in one column along with its delimiters.
4. Select that Column. In DATA collection find "convert text to columns". Specify the delimiters, generally semicolons, and text markers.
5. The text was not delivered with the same number of columns. Depending on the kind of transaction – sepa, deposit, automat, etc..there may be extra columns with location or IBAN information. You will not need these for accounting purposes. They will be in blocks. (not choosing to treat consecutive delimiters as one may resolve this. I have found they just make a worse mess.)
6. Use insert or delete functions (right click) to align columns.
7. Use formulas (=TRIM, =CLEAN, and substitute ) to strip non printing characters and extra spaces from the data.
8. Copy numerical columns, right click and choose “paste special”. Chose “values” from the interface. Paste.
9. Check that the figures are numerical by formatting numbers to show negative values in red.
10. Rarely numbers persist in text format. Multiplying by a numerically formatted 1.00 generally resolves that.
11. Note: For some reason numbers imported and formatted in this matter may not calculate correctly with autosum or the +SUM() function. Double checking with a clipboard manager and the calculator is advised.
Other considerations.
1) It has been said that Libre Office plays more nicely with European files. I've tried and I have doubts.
2) It may help to set your OS regional and language settings to the country and language of the data’s origin.
2) Check that your data has been delivered with periods as decimal points and commas as thousand separators. If you cannot change this with formatting you will have to check that the region is the US. (American 1,267.89 EU is German or Spanish 1.267,89.
 
Wait, wait, wait. I think you are over-complicating things for yourself there. DO NOT save UTF8 as ANSI. [nosmiley]

Proper way to import any non-US-standard CSV into Excel:
[ol 1]
[li]Open Excel with a new, blank workbook[/li]
[li]switch to "Data" tab[/li]
[li]in area "External data" click "From Text"[/li]
[li]browse to the csv file[/li]
[li]in the upcoming dialog, specify the encoding as "UTF8", delimiter as semicolon (German), Text qualifier if applicable[/li]
[li]specify data type for each column if required[/li]
[li]complete import[/li]
[/ol]

You will find that this - despite seemingly many steps - is A LOT faster than what you are doing, without corrupting so much data in the process.
If this works out for you, we can talk about simplifying/automating THAT process... ;-)

Cheers,
MakeItSo

"Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family." (Kofi Annan)
Oppose SOPA, PIPA, ACTA; measures to curb freedom of information under whatever name whatsoever.
 
I agree with MakeItSo...

Last month I learned about the UTF8 option for imports and it has sorted out a lot of the character issues I was having. Before, I had to do a lot of find and replace to fix the characters that wouldn't render properly like em dashes and other symbols. Recently, I received a file that contained a character that when imported turned into 4 characters. I tried my usual find and replace routines and it would only fix 3 of the 4 so that I had the right character and a british pound symbol when all I needed was the right character. Did some searching about find and replace options thinking I would have to use an external program, when I came across a thread mentioning UTF-8. After creating a new import spec, the data imported and looked clean without funny characters.
 
To add to this: in cases when there are even line breaks within cell contents, I found using LibreOffice for opening the .csv much, much more reliable.
Saving as .xlsx from there made subsequent handling in Excel hassle-free.

"Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family." (Kofi Annan)
Oppose SOPA, PIPA, ACTA; measures to curb freedom of information under whatever name whatsoever.
 
MakeITSo:

You are absolutely correct except that these are not US files, and experimentation has shown that they do not import as UTF. One of the issues, at least for me, was that the delimiters imported as non printing characters, making conversion to numerical format impossible. sxchech: It did not function correctly for me, but I have heard the same thing.
While files from Spain import with far fewer obstacles, those from Deutsche Bank have been a nightmare for me for years. Oddly, DB is not aware or interested in the issue, despite being at the moment one of the most recognized international banks. I have worked with data imported both from the German and the English site.
Just for fun I can try to re-import without translating into ansii, bit at least my version of Excel imports with ansii by default, ergo the need for the legacy import. The above system, even if it a bit more complicated, delivers the data in an organized format.
 
Hi jlockley,

Excel imports with ansii by default
Excel opens csv as ansi by default, assuming the delimiter based on OS specs.
When importing via Data=>External data=>From text, you are free to adjust these settings:

xl_import_aarczu.jpg


The red cross is at the "file origin" drop down. "65001: Unicode (UTF-8)" is close to the bottom of the list as it is alphabetically sorted.

"Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family." (Kofi Annan)
Oppose SOPA, PIPA, ACTA; measures to curb freedom of information under whatever name whatsoever.
 
That is not quite the interface I see (I have it set to English right now, but otherwise that doesn't seem to matter) but just for fun, here is jpg of the file as it imports from .csv format. The interface is set to ansii. The not on download says the file is comma delimited, but looking at the next download show, it is semicolon delimited.
 
 https://files.engineering.com/getfile.aspx?folder=5fb45e70-0659-4d50-a049-c218e20247a6&file=skip_2.txt
Same file as txt.
Note a couple of oddities. First, while the debits are introduced with ", the single deposit has none. This is true of all deposits, which offsets those initial cells.
Take this line.
"01/21/2019;01/21/2019;""SEPA-Direct Debit"";Telefonica Germany G;OTHR Sonst. Transakt Kunde DE11356897 ",Rechnung M211190213445103 ,O2 sagt Danke;DE70200300000000297622;HYVEDEMM300;000623683213;T0030001000000DE11356897;DE9700000000142462;;;;;;-24.89;;EUR,,,,
The ; delimiter seems to be inside the quotation marks, which are used only on the first columns of debits. The commas [highlight #FCE94F],O2 sagt Danke,[/highlight] Also appear to be delimiters. Choosing both ; and , as delimiters has no impact.

So let convert the first text to columns.


 
 https://files.engineering.com/getfile.aspx?folder=0c9fe8bd-32c8-4011-ad93-288a370e2d70&file=DB_sample.txt
Two of the most obvious issues:
1) Most of the amounts are missing.
2) The transaction type, etc, which should be two columns, is duplicated.
So, Normal doesn't seem to be an option here.

A further notes. Since the data inserts IBan's and customer number on some without compensating for extra fields on those transactions not using IBAN'ss etc, many entries end up offset by several columns. Excel also imports the quotation marks as non printing characters, so that a thorough cleanup is required. Obvsiously, if they are text, they are not functioning as separators.

I noticed that using my above solution does not work all the time (but it is an improvement.) Importing a larger amount of data resulted in the final entries being completely out of sync.

Go figure.
 
 https://files.engineering.com/getfile.aspx?folder=b5bae0ad-823e-4b1c-a69f-4ce2e3ac8111&file=failed_to_convert_to_colums_DB.JPG
You are still not properly importing the file, only opening it.
Do not "convert to columns". This is the wrong way!
The txt file is UTF 8, not ANSI. You must therefore import as UTF 8, specify semicolon as delimiter and double quote as text qualifier.


"Knowledge is power. Information is liberating. Education is the premise of progress, in every society, in every family." (Kofi Annan)
Oppose SOPA, PIPA, ACTA; measures to curb freedom of information under whatever name whatsoever.
 
by looking at your sample files they seem like something that is done by parsing a PDF file and extracting the data - please tell me this is not so and that the files do come from the banks directly on their own format.

If it is indeed from pdf to txt this can happen. there are ways around some of the issues, but not always easy or pretty to automate.

Otherwise if it is from a bank directly (extracted as .txt or .csv) can you please post a unedited file - you messing up with excel or whatever else to manipulate it may be what is causing your files to "look" wrong.

Regards

Frederico Fonseca
SysSoft Integrated Ltd

FAQ219-2884
FAQ181-2886
 
... but be careful what you post because files "from a bank directly" may contain pretty sensitive (read: secret) information...


---- Andy

There is a great need for a sarcasm font.
 
fredericobonseca: I agree, but this information was downloaded directly as a .csv file.
Andrezejek: Absolutely. Much of the information has been changed.
At any rate I take off for Berlin again tomorrow, where it will be easier to harrass tech. I have found, unfortunately, that the system I made work once is not reliable very time. And yes, it is UTF 8, which is why saving it to ansii worked. Importing it from UTF 8 was not ideal. (The delimiters imported as non printing text.)
You would think that an international bank would have more universally effective data sharing.










 
> can you please post a unedited file

I'm with Frederico on this - we need the unadulterated original to be able to help further, before you manipulated it at all.
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top