Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Mike Lewis on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Company Listbox to fill with companies worldwide listed at stock markets 4

Status
Not open for further replies.

german12

Programmer
Nov 12, 2001
563
DE
I would like to fill a list box with as many companies as possible that are listed on the world's stock exchanges.
Unfortunately, I haven't found any sources on the internet that would make it as easy as possible to fill a VFP file.
The name of the company would be good -. if the securities identification number and/or the industry were also included - all the better.
The VFP file is intended to serve as a reference to find all stock market reports from a stock market magazine, which are collected in a large memo file.
I can easily add to this memo file every week.

This means, for example - after entering "Rheinmetall" all reports about Rheinmetall should appear. This would make storing the stock market magazine unnecessary and information about a searched company could be available much more quickly.
This would also make it easier to assess a recommended stock if, in addition to the current assessment, you can also see what has been said about this company in the past.

Maybe someone knows a good source of such companies.
Maybe there is someone who has done such coding before.

Of course, such a project would not only be useful for stock market transactions; there would be many possibilities to make text searches easier in other areas as well. VFP is particularly fast for this...

thanks in advance
Klaus

Peace worldwide - it starts here...
 
I know this idea comes from a suggestion I already made in your old long thread.

That has 2 points, the minor point is:
If you have a table from which you need to SELECT DISTINCT COMPANY why don't you have a table with the distcint names instead of condensing the list of names, there's no value in storing repeated values.

And since your last questions in general were about stock exchange rates and company news, well, that cries out for having a tbale of companies.

So you take the idea one or two levels up, I think google could find something, but mainly the idea is you get from your unstructured data you have to a more structured storage, something I'd call an organically grown database, driven by the challenges and needs you have. You don't need to reach out for the full list of all companies in the world, though, what would it help you to have hundreds of thousands of companies of which you don't have news nor stock exchange rates?

So I'd just statrt with the result of the SELECT DISTINCT COMPANY query as its own table. Add at least one column, the id with autoincrementing integer or guid. You might add some columns, but only about single properties, for example stock ticker symbols, if a company is on the stock market. Other data like articles about a company would go into an article table that would refer to the company by its id. That's in very short data normalization in the most often used 1:n relationship of on1 head record to which many (n) details records exist, like articles, stock exchange rates history, whatever else, with tables for each aspect.

If you have that company table, a listbox or also a combobox is easier done with using the full table as rowsource, isn't it? And that's not the onliest advantage.

Chriss
 
Chriss, I think I've been misunderstood.
Of course I want to continue working with a file that does not contain duplicate company names.
The SQL command was only necessary to create such a table.
You can see why this was necessary in the photo below.
20240313_085757_z89ejb.jpg


They are the tables of contents from 2 issues of a magazine, and I didn't feel like entering them manually in a table, they were already available to me digitally in this form.
As you can see, it would be difficult to see where a name repeats itself.
So I transferred both tables of contents using drag and drop and to avoid having to check where a name is repeated, I filled my listbox with "select distinct...".
That only lasted a fraction of a second. Even with 100 issues, this would be the fastest way to get a file without duplicate entries.
As long as I can't load a complete file with company names, this is a way to very quickly improve such a file with each new issue of the magazine.
This brings me to what is intended - namely to obtain a listbox reference file for searching for information that even relates to the magazine whose information interests me.
I am really only interested in the information of that magazine to concentrate it by company - and to see what has happen or to see what has been said, especially when there
is am recommendation to buy a title, which I had not read about in the past.

I hope that is more clear now.

Klaus


Peace worldwide - it starts here...
 
Klaus said:
to avoid having to check where a name is repeated, I filled my listbox with "select distinct...".

That is understood, Klaus.

Klaus said:
The SQL command was only necessary to create such a table.
But there is your misunderstanding. When you use such a query, you create a result cursor, that VFP forgets. It's not becoming a permanent DBF file, so you do this over and over every time.

You argue...
Klaus said:
Even with 100 issues, this would be the fastest way to get a file without duplicate entries.
No, having such a table is always faster than needing to create it. Keeping your list with duplicates is bad, not only because it needs a neglectible time. But you can't even begin to store all other data related to a company to a unique id or record number. So that's not the way to go.

While it does only take split seconds even with 1000s of rows, you better keep track of a list of companies in a DBF file that you don't need to create by query over and over again.

Once you have a list of distinct names, for which your query is perfectly valid, you should make that result a dbf file that you keep and instead of binding a listbox to that query, you can bind it to that table.

Code:
CREATE DATABASE stockdata.dbc
CREATE TABLE companies.dbf (id INTEGER AUTOINC PRIMARY KEY, company varchar(40))
INDEX ON company TAG company CANDIDATE

INSERT INTO companies (company) SELECT DISTINCT company FROM corplook1 WHERE NOT EMPTY(company)
* USE DBF('corplook1') in corplook1 EXCLUSIVE
* ZAP in corplook1 && uncomment, if you understand this empties your list with duplicates and why to do so
CLOSE ALL

To bind that to a listbox you only need to set - and this time I use Mikes recommendation of RowSourceType 6 = Fields:
Code:
OPEN DATABASE stockdata.dbc && this could be done in main.prg
USE stockdata!companies ORDER company SHARED && this could be done in the form load event, or with the DE of the form, including the setting by which index to sort
* set listbox.RowSourceType=6
* set listbox.RowSource = 'companies.company'

You may now ask: What about new companies?

Well, you can continue to create and fill the corplook1.dbf with companies of a new magazine issue, then add only new companies into your permanent companies.dbf, with:
Code:
INSERT INTO companies (company) SELECT DISTINCT company FROM corplook1 where !empty(company) and company NOT in (Select company FROM companies)
* USE DBF('corplook1') in corplook1 EXCLUSIVE
* ZAP in corplook1 && uncomment, if you understand this empties your list with duplicates and why to do so

The candidate index my code creates does not allow two company records with the same name, so it's impossible for duplicates to get into the dbf with such a candidate index.

And once you have transferred the distinct names into your permanent copmpanies.dbf you can delete the list of companies you already have processed, for that reason there's the ZAP part. I commented it because I don't know whether your corplook1.dbf contains more fields with data you don't want to delete. But this list is only necessary temporary for getting in really new names you never had beforehand.

You shouldn't want unstructured data with duplicates as your permanently stored data. That never will be a good basis of a database with many more tables to come that all should refer to a unique company id.

If you don't do this, For example the growing list of companies in the corplook1.dbf does not only need longer to be queried, the list changes and companies will get new list positions, when new names arise. And that's making it impossible to extend from there. What you need is a permanent unique identifier for every record to have a reliable and reusable reference that is stable from the moment it is assigned and so does not become invalid. You don't want your articles about IBM suddenly refer to another company just because a new company "Best Inc." (or any company with a name before IBM) changes the position of IBM in the list one down.

What you also shouldn't do is refer to a company in every table by the company name, just because it's short enough. Also not because a name is never changing (and imagine it does change nevertheless) it is not at all a good idea to relate data to the name of copanies, you give companies identifiers, in general you give a unique identifier to any item, any record, that's something to put into your routinely thinking about tables.

Also, you see: You don't need a physically ordered list to get a sorted list, we already showed that an index is the best thing to use for that in older threads. So no matter in which order companies go into a companies.dbf, the index on the company name sorts it.

Chriss
 
I don't know where you'd find such a list of companies online, but assuming you can find it, you should be able to just copy and paste the content into a text file. Then, you can write some fairly simple code to read in the file, break it into lines and pop the data from each line into a table. Something like (and this is just quick & dirty):

[pre]local cListData, aCompanies[1], nCompanies

cListData = FileToString('YourCompanyFile.TXT')
nCompanies = ALINES(aCompanies, m.cListData)

FOR nCompany = 1 TO m.nCompanies
INSERT INTO YourTable VALUES(aCompanies[1])
ENDFOR
[/pre]

Obviously, if there's more in the file than just the names, you'd need to do a little parsing of each line, but ALINES() will do that for you, too.

Oh, and if you have SnagIt, you might try just grabbing those images you posted and having SnagIT do OCR on them. I just learned recently that it can do that (and I've been using it for more than 30 years, I think).

Tamar
 
Tamar & Chriss,
2 stars for you....
Thank you very much for your really helpful and educational contributions.
Tamar, I don't have the Snag it program - it costs around 200 euros in Germany - but I like to think it's worth it for professionals.
Your other tip is also
very helpful.
I've already found a list, but I need to take a closer look at it.
and if you click on "Continent", then around 10,000 titles are displayed - even with industry name, country and further information (dividend) for Europe, North- and South America....

Chriss, I agree with you about the clarity of the data,
And I can now well imagine that indexing has such a high value for later processing.
I now need some time to research further what I can actually use in the end, but so far I'm very impressed with the help in this forum - and not just in this case.

Klsus

Peace worldwide - it starts here...
 
Thanks for your kind words, Klaus.

One more reminder to be cautious with anything you do to determine a permanent list of companies. It's impressive what you found, but as complete as it is, you'l one day find an article about a new company not yet listed, plus the much more comprehensive list of all (bigger) companies around the world would likely include companies never mentioned in your magazines, so why bother having a complete list? And another caution to tak is if you would update this list by using the same source of information, reusing an import routine, take care to not change any IDs you already generated, so establishing a first list and maintaining it are separate issues. In the ideal case you already care for duplicates and IDs that are not only unique but stay the same for the same companies.

So think about whether you want to make it a "one off" job (done once) to establish a comprehensive first list. Or let this list grow with each issue you process your way. Both approaches have their pros and cons, you surely make a bigger leap from zero to a list with an import from an online resource and then can also switch to maintain it manually from then onwards, with rare changes necessary.

Chriss
 
I don't have the Snag it program - it costs around 200 euros in Germany

You don't have to use SnagIt. There are several free or low-cost screen capture / OCR utilities around, for example Capture2Text and TextShot. If you do a search for "free screen capture and OCR", you will probably turn up some more.

Mike

__________________________________
Mike Lewis (Edinburgh, Scotland)

Visual FoxPro articles, tips and downloads
 
Well, to add to the topic of OCR: As you said you can perfectly get text from the for of these as eMgazines you have, too, there's no need for text recognition of a scan or photo at all.

Chriss
 
Hello,

I have done something similar and did it with getting the data from internet

So basically :
Find the urls with data of interest , for example and store them into a table
do a loop with :
Download page to file
Declare Long URLDownloadToFile In URLMON.Dll LONG, String, String, Long, Long
nresult = URLDownloadToFile(0, curl, cftemp, 0, 0)

filetostr() the file

search for keywords ("tags") with AT(), getwordnum(),..
For example 'itemprop="price" content="' as a keyword should give you the price in the next 10 letters
You may also look for "MSXML2.XMLHTTP" - "GET" - getElementsByTagName instead of AT.... or have a look on chilkats html class
Store the information into a structured table

Hint
Using AT.. sometimes you have to search (maybe more then once) for "starter keyword" and then take a string from where you find it and search from there for the "main keyword". Thats because the "main keyword" may occur more then once, then for search take the rest of the filetostr() after the "starter keyword"

Of course its only a snapshot and there may be dynamic content not included inside the file. Just test it with different sites.

HTH
Tom
 

Thanks for the comment, Chris.
It's true - digital information is already available here.
The tip is still useful because this problem can occur in many other cases where there is no digitized data.

Klaus



Peace worldwide - it starts here...
 
Tom: So basically : Find the urls with data of interest said:
https://www.boerse.de/aktien/Rheinmetall-Aktie/DE0...[/URL] and store them into a table]

Thanks for your interesting comments, Tom
This would later be an opportunity to obtain additional information about stock market events from the Internet.
I tried your link and noticed that, for example,
for stocks in Germany, the address to be entered always comes first
must be called manually. This is of course very time-consuming.
If you had a list of securities identification numbers,
then VFP could generate these addresses itself.
For example, these 3 addresses are as follows:

[highlight #FCE94F]BAY0017[/highlight]
[highlight #FCE94F]8404005[/highlight]
[highlight #FCE94F]A0KEXC7[/highlight][highlight #FCE94F][/highlight]

As you can see, VFP could form the full URL itself as the text on the left side of the URL is always the same.
if you have a list of ID numbers of these papers.
It will be similar with US stocks, and therefore you could get details from VFP with your code for a lot of papers.

But as I said, that would be a project for later.
Thank you

Klaus




Peace worldwide - it starts here...
 
Hello Klaus,

thanks for the tip.
At the moment we are doing it with another site, but I keep that in mind.

Regards
tom

Btw.: Vectron's WKN is A0KEXC, the 7 maybe an indicator from boerse.de which stock exchange to use
 
Tamar, just a small question
in your code above you wrote this loop

local cListData, aCompanies[1], nCompanies

cListData = FileToString('YourCompanyFile.TXT')
nCompanies = ALINES(aCompanies, m.cListData)

Loop
FOR nCompany = 1 TO m.nCompanies
INSERT INTO YourTable VALUES(aCompanies[1])
ENDFOR

I tried it, but that only filled Yourtable with allways the first line of the textbox.

I changed it then a little bit

FOR nCompany = 1 TO m.nCompanies
INSERT INTO Yourtable VALUES(aCompanies[ncompany])
ENDFOR
and now it works.
Now I asked myself, why in your code it does not work well - was it just a typing error, or is there another reason?

Thanks
Klaus



Peace worldwide - it starts here...
 
Just a typo. That's what happens when I write code in a forum and don't test it.

Tamar
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top