Modeling a data warehouse for recipes

Status
Not open for further replies.

Olaf Doschke

Programmer
Oct 13, 2004
14,847
DE
Hi all,

I'm quite a novice to data warehousing.

How would you model a data warehouse for searching recipes? Searches will be made with
a) giving relative amounts of ingredients that have to be in a recipe; recipes within those ranges are returned (e.g. search for recipes having 10-20% apples and 20-40% sugar).
b) giving a full recipe, for which the "most similar" recipes are searched (e.g. which recipe is most similar to "apple pie"?).
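
Search a) can be sketched in a few lines of Python as a baseline before any warehouse design. This is only an illustration under assumptions: the sample recipes, the name `search_by_ranges`, and the choice to store amounts as fractions between 0.0 and 1.0 are all hypothetical and not taken from the real data.

```python
# Hypothetical sample data: recipe name -> {ingredient: fraction of total (0.0-1.0)}.
RECIPES = {
    "apple pie": {"apples": 0.15, "sugar": 0.30, "flour": 0.40, "butter": 0.15},
    "apple sauce": {"apples": 0.80, "sugar": 0.20},
    "shortbread": {"sugar": 0.25, "flour": 0.50, "butter": 0.25},
}


def search_by_ranges(recipes, ranges):
    """Return names of recipes whose fractions lie inside every (low, high) range."""
    hits = []
    for name, ingredients in recipes.items():
        # An ingredient that is absent counts as 0.0, so it only matches
        # ranges that include zero.
        if all(
            low <= ingredients.get(ingredient, 0.0) <= high
            for ingredient, (low, high) in ranges.items()
        ):
            hits.append(name)
    return hits


# Recipes with 10-20% apples and 20-40% sugar.
print(search_by_ranges(RECIPES, {"apples": (0.10, 0.20), "sugar": (0.20, 0.40)}))
```

This scans every recipe, which is exactly the slow part on a large data set; it is meant as a reference for checking faster structures against.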

Recipes can have a varying number of ingredients out of thousands of raw materials, so a recipe could be represented as an N-dimensional point in the space of raw materials, where each raw material is a dimension (N in the range of 10,000!) and each value ranges from 0% to 100%.

But most recipes of course have only 10-20 raw materials and are 0% in all other dimensions (raw materials). So shouldn't it be possible to store this in several cubes of lower dimension?
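
Because almost all of the roughly 10,000 dimensions are zero, one option is to store only the non-zero entries of each recipe and compare recipes on those. The sketch below assumes cosine similarity as the meaning of "most similar"; the thread does not define a similarity measure, so that choice, the function names, and the sample data are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {ingredient: fraction}."""
    # Only ingredients present in both recipes contribute to the dot product.
    dot = sum(value * b[key] for key, value in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, recipes, top_n=3):
    """Return up to top_n (name, score) pairs, best match first."""
    scored = [(cosine_similarity(query, ingredients), name)
              for name, ingredients in recipes.items()]
    scored.sort(reverse=True)
    return [(name, score) for score, name in scored[:top_n]]


# Hypothetical sample data.
APPLE_PIE = {"apples": 0.15, "sugar": 0.30, "flour": 0.40, "butter": 0.15}
OTHERS = {
    "apple crumble": {"apples": 0.50, "sugar": 0.20, "flour": 0.20, "butter": 0.10},
    "tomato soup": {"tomatoes": 0.90, "salt": 0.10},
}

for name, score in most_similar(APPLE_PIE, OTHERS):
    print(f"{name}: {score:.3f}")
```

The cost of one comparison depends on the 10-20 ingredients actually present, not on N. Other measures (for example Euclidean distance on the same sparse dictionaries) would fit the same structure.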

Although a) could also be done on the production data, the amount of data makes this quite slow. There are three tables involved: the main table is recipes, then there are subrecipes, and these have ingredients, some of which may themselves be recipes or subrecipes. This can of course be denormalized to recipes with ingredients of a certain amount. I think that would be a good first step.
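
The denormalization step could look roughly like the following recursive expansion, which multiplies fractions down the nesting until only raw materials remain. The nested dictionary layout, the name `flatten`, and the sample formulas are hypothetical; the real three-table schema is not shown in the post.

```python
from collections import defaultdict

# Hypothetical nested data: name -> list of (component, fraction of parent).
# A component that is itself a key here is a (sub)recipe; anything else is
# treated as a raw material.
FORMULAS = {
    "apple pie": [("crust", 0.4), ("filling", 0.6)],
    "crust": [("flour", 0.6), ("butter", 0.3), ("sugar", 0.1)],
    "filling": [("apples", 0.7), ("sugar", 0.3)],
}


def flatten(name, formulas, scale=1.0, totals=None, path=()):
    """Expand a recipe into {raw material: fraction of the finished recipe}."""
    if name in path:
        # Guard against a recipe that (indirectly) contains itself.
        raise ValueError(f"cyclic reference via {name!r}")
    if totals is None:
        totals = defaultdict(float)
    for component, fraction in formulas[name]:
        if component in formulas:
            flatten(component, formulas, scale * fraction, totals, path + (name,))
        else:
            # The same raw material may arrive via several subrecipes.
            totals[component] += scale * fraction
    return dict(totals)


print(flatten("apple pie", FORMULAS))
```

The flattened rows (recipe, raw material, fraction) are the shape that both kinds of search above would work on.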

But then?
Building some clusters of recipes with similar ingredients?
Perhaps building a one-dimensional cube for each ingredient (would be equivalent to a simple index)?
I could build clusters per type of recipe, as those should be more similar to each other, but then you don't know...
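
The "one-dimensional cube per ingredient" idea can be tried out as a sorted list per ingredient, with a range query answered by binary search and the per-ingredient results intersected. This is a minimal sketch on hypothetical data; `build_index`, `names_in_range`, and `search` are illustrative names. In a relational database the equivalent would presumably be an index on (ingredient, amount) over the flattened table.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical flattened data: recipe name -> {ingredient: fraction}.
RECIPES = {
    "apple pie": {"apples": 0.15, "sugar": 0.30, "flour": 0.40, "butter": 0.15},
    "apple sauce": {"apples": 0.80, "sugar": 0.20},
    "shortbread": {"sugar": 0.25, "flour": 0.50, "butter": 0.25},
}


def build_index(recipes):
    """Map each ingredient to (sorted fractions, recipe names in the same order)."""
    postings = defaultdict(list)
    for name, ingredients in recipes.items():
        for ingredient, fraction in ingredients.items():
            postings[ingredient].append((fraction, name))
    index = {}
    for ingredient, entries in postings.items():
        entries.sort()
        index[ingredient] = ([f for f, _ in entries], [n for _, n in entries])
    return index


def names_in_range(index, ingredient, low, high):
    """Recipes that contain the ingredient with low <= fraction <= high."""
    fractions, names = index.get(ingredient, ([], []))
    start = bisect_left(fractions, low)
    end = bisect_right(fractions, high)
    return set(names[start:end])


def search(index, ranges):
    """Intersect the per-ingredient matches for all requested ranges."""
    sets = [names_in_range(index, ing, low, high)
            for ing, (low, high) in ranges.items()]
    return set.intersection(*sets) if sets else set()


index = build_index(RECIPES)
print(search(index, {"apples": (0.10, 0.20), "sugar": (0.20, 0.40)}))
```

One limitation of this sketch: the index only knows recipes that actually contain an ingredient, so a range that includes 0% (meaning "little or none") would need separate handling.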

How would you model this?

Bye, Olaf.
 
Is the purpose of the cube to analyze recipes or to group recipes by ingredients? Or what? What is the primary purpose of the database? For querying? Will recipes be changed as time passes or are they constant?

How specific are the ingredients? Is an apple an apple, or can it be a McIntosh or Granny Smith? In that case, ingredients have a "family" of their own.

Just some food for thought (sorry about the pun). Anyway, what's the purpose of the database?

-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was.
- Steven Wright
 
Hi John Herman,

Well, there are several purposes, but the goal is to predict properties of new recipes. Those recipes are really cosmetic products. There are production recipes and also the recipes that led to those final recipes while developing a product. So there are a great many recipes.

Those recipes are not changed in the data; modifications are written as a new recipe, not always with a backlink to the historic predecessor of that recipe. Several subrecipes may also be put together into a new recipe. The referencing is quite complicated, but mainly the data is only appended, not changed.

When talking about apples and sugar, that was of course just a simple example. Yes, there are types of ingredients or raw materials and I can determine that type, although the recipes or formulas record the very specific raw material that was used, comparable to "Granny Smith" or "Golden Delicious". But these families of raw materials overlap: for example, the chemical function of a raw material could be an emulsifier in one recipe but a color in another, and I can't determine what the chemical function is inside a certain formula/recipe. I can only determine the primary chemical function.

Well, I've read a little bit about classification hierarchies and the multidimensional model, a fact table which references several dimension tables in a star schema, etc. But as it is with complex new things, I can't really put those parts together yet; I haven't got a grip on it or a good feeling for it.

Bye, Olaf.

 
A data model for this should be similar to a facility maintenance tracking system. In that system, there are component parts (for you, base ingredients). There is also a concept called an assembly (a collection of component parts; for you, a subrecipe). A combination of components and assemblies creates a machine, or product. Each ingredient or subrecipe would acquire certain properties dependent on its finished recipe (color versus emulsifier in your example).

Recipes can belong to a Recipe Family, a concept intended to keep track of related and prior versions of current products.
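
As a rough illustration of the component/assembly idea, the structure could be sketched like this. All class names, fields, and sample values are hypothetical; in particular, the per-usage `role` is only one possible way to record that a raw material acts differently in different recipes, falling back to its primary function when the role is unknown.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(frozen=True)
class RawMaterial:
    """A component part: one specific raw material."""
    name: str
    primary_function: str


@dataclass
class Usage:
    """One line of a formula: what is used, how much, and in which role."""
    item: Union["RawMaterial", "Formula"]
    fraction: float
    role: Optional[str] = None  # None means the role in this formula is unknown


@dataclass
class Formula:
    """An assembly: a recipe or subrecipe made of raw materials and subrecipes."""
    name: str
    family: Optional[str] = None  # groups related and prior versions
    usages: List[Usage] = field(default_factory=list)

    def add(self, item, fraction, role=None):
        self.usages.append(Usage(item, fraction, role))
        return self


def effective_role(usage):
    """Role recorded for this usage, else the raw material's primary function."""
    if usage.role is not None:
        return usage.role
    if isinstance(usage.item, RawMaterial):
        return usage.item.primary_function
    return "subrecipe"


# Hypothetical example: the same raw material in two different roles.
rm = RawMaterial("RM-001", primary_function="emulsifier")
base = Formula("base cream v1", family="base cream").add(rm, 0.05)
tinted = Formula("tinted cream v1", family="tinted cream")
tinted.add(base, 0.90).add(rm, 0.10, role="color")

for usage in tinted.usages:
    print(usage.item.name, usage.fraction, effective_role(usage))
```

In a relational design the `Usage` class would correspond to a link table between formulas and their components, with the role stored on the link rather than on the raw material.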

Digest this (sorry about the pun again) and let's discuss further if you wish.



-------------------------
The trouble with doing something right the first time is that nobody appreciates how difficult it was.
- Steven Wright
 