Archive with quality control 2

maiago · Dec 10, 2005

I have an installation of DSpace, MIT's open-source software for managing self-archiving. I need to implement a subsystem on it to introduce quality control. The requirements of the subsystem are explained in the paper

http://www.dimi.uniud.it/~mizzaro/research/papers/EJ-JASIST.pdf.

I have more than one question.
1) Where is the separation between analysis and design? Are there fixed steps for transforming an analysis class diagram into a design class diagram?
2) Scores are assigned to users, authors and papers. Users of the archive can express judgements on papers. After each judgement (and only at that moment) the subsystem has to update some scores (of the paper, the reader, the author of the paper and the other readers who voted on the paper) on the basis of some updating formulas. What is the best way to design this, considering that the formulas could change in the future? While updating some entities the subsystem also needs the old and updated scores of other entities. How do I manage this? I would need a way to take a photo of the entities before updating.
3) The score's structure can change. I have to make it simple to introduce, for instance, a new score parameter. I also have to consider that new parameters cause changes to the updating process. Do I need to link the score with its updater in some way?

Sorry for my poor English, and thanks

grooke · Dec 12, 2005

A good one!

I have drawn a Domain (class) diagram and a use case diagram. Describing complex diagrams in text is difficult. Getting this solution right is another problem; I would normally rework it all a few times and add sequence diagrams to make sure it works. But here goes:

In the domain diagram, there is a 'Subscriber' who can have 1 or 2 'Subscriber Roles'. Two classes linked by a composition association; 1 -> 1..2.

The 'Subscriber Role' has two derived classes, 'Reader' and 'Author'.

Now we can turn to the use case diagram. There are four use cases performed by three actors. The actors are the roles:
The 'Subscriber Role' has two use cases: 'Subscribe' and 'Update Subscription'. The latter allows the subscriber to adopt a second role.
The 'Author' can 'Publish'
The 'Reader' can 'Review'
These will lead to four sequence diagrams in the end.

Now return to the domain diagram. I have a 'Literary Object' class that has two attributes (assuming simplest case): 'Score' and 'Steadiness'.

The 'Literary Object' has two derived classes: 'Subscriber Role' and 'Article'. This inheritance is NOT necessary and may even be wrong, but it just allows me to have the attributes 'Score' and 'Steadiness' in one place, so that I can have many pairs for different scores in the future (readability, interest, accuracy etc).

This leaves me with three types of Literary Objects each of which has a pair of attributes: 'score' and a 'steadiness'.
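
The domain classes described so far could be sketched roughly as follows. This is only an illustration of the diagram in words: all class, field and method names are assumptions, and the rule that the two roles must be of different kinds is my reading of "an author or a reader (or both)".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Shared home for the 'Score' and 'Steadiness' pair (names are hypothetical).
abstract class LiteraryObject {
    private double score;
    private double steadiness;

    double getScore() { return score; }
    double getSteadiness() { return steadiness; }

    void setScore(double score, double steadiness) {
        this.score = score;
        this.steadiness = steadiness;
    }
}

abstract class SubscriberRole extends LiteraryObject { }

class Reader extends SubscriberRole { }

class Author extends SubscriberRole { }

class Article extends LiteraryObject {
    private final String title;

    Article(String title) { this.title = title; }

    String getTitle() { return title; }
}

// Composition 1 -> 1..2: a subscriber always has one role and may adopt a second.
class Subscriber {
    private final String name;
    private final List<SubscriberRole> roles = new ArrayList<>();

    Subscriber(String name, SubscriberRole firstRole) {
        this.name = name;
        roles.add(firstRole);
    }

    String getName() { return name; }

    // Corresponds to the 'Update Subscription' use case.
    void addRole(SubscriberRole role) {
        if (roles.size() >= 2) {
            throw new IllegalStateException("a subscriber has at most two roles");
        }
        if (roles.get(0).getClass() == role.getClass()) {
            throw new IllegalArgumentException("role already held");
        }
        roles.add(role);
    }

    List<SubscriberRole> getRoles() { return Collections.unmodifiableList(roles); }
}

class DomainModelDemo {
    public static void main(String[] args) {
        Subscriber s = new Subscriber("example subscriber", new Reader());
        s.addRole(new Author());
        System.out.println(s.getName() + " holds " + s.getRoles().size() + " roles");
    }
}
```

Because each role is its own 'Literary Object', a subscriber who is both reader and author ends up with two separate score/steadiness pairs.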

Now. Many readers may review each article and each reader may review many articles, so there is a many-to-many relationship here. This will probably require a link class, because for future reworks of values, it may be good to remember what value the reader gave the article or what value the reader had themselves when they reviewed the article or both. Remembering the scores given (raw data) also leaves you in a better position to change the scoring algorithms at a later date.

The 'Reader-Article' link class needs an operation to 'Score Article()' so that the reader can pass down the reference to the article and the score(s). This creates an instance of the class with the joint key of Reader and Article and the value assigned by the reader.

This operation also needs to call an operation in the 'Article' which will 'Score Article()'.

As well as calculating the Article's score, it also goes on to 'Score Author()' via the 'Article-Author' link class in much the same way.

The 'Article:Score Article()' operation will also 'Score Reader()'. This returns through the 'Reader-Article' link class and may well store the 'goodness' of the review, also as raw data for further calculations. This will itself update the Score of the current 'Reader'.

Then comes the task of updating 'Previous Readers'. Depending on the formula, most of the necessary data should be in the 'Reader' class and the 'Reader-Article' link class.
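
A rough sketch of that call chain, under loud assumptions: the three formulas below (plain means, and 'goodness' as closeness to the article's score, with scores taken to lie between 0 and 1) are placeholders and NOT the formulas from the paper; the 'Article-Author' link class is collapsed into a direct reference to keep it short; and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class Author {
    final List<Article> articles = new ArrayList<>();
    double score;

    // Placeholder rule (assumption): mean of the scores of the author's articles.
    void scoreAuthor() {
        double sum = 0;
        for (Article a : articles) sum += a.score;
        score = articles.isEmpty() ? 0 : sum / articles.size();
    }
}

class Article {
    final Author author;
    final List<Judgement> judgements = new ArrayList<>();
    double score;

    Article(Author author) {
        this.author = author;
        author.articles.add(this);
    }

    // Placeholder rule (assumption): mean of all raw values given so far.
    void scoreArticle(Judgement latest) {
        judgements.add(latest);
        double sum = 0;
        for (Judgement j : judgements) sum += j.value;
        score = sum / judgements.size();
        author.scoreAuthor();   // goes on to score the author
    }
}

class Reader {
    final List<Judgement> judgements = new ArrayList<>();
    double score;

    // Placeholder rule (assumption): mean 'goodness' of the reader's reviews.
    void rescore() {
        double sum = 0;
        for (Judgement j : judgements) sum += j.goodness;
        score = judgements.isEmpty() ? 0 : sum / judgements.size();
    }
}

// The 'Reader-Article' link class: keyed by reader and article, keeps the raw data.
class Judgement {
    final Reader reader;
    final Article article;
    final double value;   // raw value the reader gave
    double goodness;      // closeness of that value to the article's current score

    private Judgement(Reader reader, Article article, double value) {
        this.reader = reader;
        this.article = article;
        this.value = value;
    }

    static Judgement scoreArticle(Reader reader, Article article, double value) {
        Judgement j = new Judgement(reader, article, value);
        reader.judgements.add(j);
        article.scoreArticle(j);
        // Back in the link class: rescore the current reader and the previous readers.
        for (Judgement other : article.judgements) other.scoreReader();
        return j;
    }

    void scoreReader() {
        goodness = 1.0 - Math.abs(value - article.score);
        reader.rescore();
    }
}

class ScoringDemo {
    public static void main(String[] args) {
        Article article = new Article(new Author());
        Reader first = new Reader();
        Judgement.scoreArticle(first, article, 0.8);
        Judgement.scoreArticle(new Reader(), article, 0.4);
        System.out.println("article=" + article.score + " first reader=" + first.score);
    }
}
```

Since every raw value stays in the link class, swapping any of the placeholder rules later only means recomputing from the stored judgements.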

Now, your 'scores' can be split into many fairly easily. If you want to treat say 'Articles' differently, then put scores etc that only belong to a specific type of literary object, in its specific class.

Similarly, your algorithms should all be specifically in an appropriate class, and you can just rewrite them as required. Of course, if you add more scores to update, then you will need to add these to the calculations.

This is roughly how I would approach it. Clearly, I may have put some 'sillies' in here; trying to save time. But I hope you get the picture. This is one case where drawing the use cases fairly carefully as sequence diagrams will pay handsome dividends, as it will find lots of silly things I (you) have done. Don't worry. That's what they are there for.

Now for your questions:
1) I think design starts when you add one controller, user interface or helper class to those in the domain model. I would continue expanding the domain model until its structure starts to get confused by the additional classes, and then maybe start to keep two diagrams. Another approach is to just add these other classes in sub-diagrams that just show some aspect of the domain. If you can manage without confusion, then don't split; it gives you traceability. I know others have different rules for this, so take different advice (particularly from real programmers and NOT just analysts).
2) I think I have covered that. Given that the three 'Literary Objects' have all the current weighted averages, and the link classes have all the raw data, you can make 'snap-shots' (Photos) whenever you want. You may find another use case or two to do this specifically.
3) I think I have covered this.
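
One possible shape for the 'snap-shot' in answer 2, as a minimal sketch: copy the old scores into a read-only object first, then compute every new score from that copy, so the order of the updates stops mattering. The entity class, the ids and the "average with the other's old score" rule are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

class ScoredEntity {
    final String id;
    double score;

    ScoredEntity(String id, double score) {
        this.id = id;
        this.score = score;
    }
}

// The 'photo': old scores copied once, never modified afterwards.
final class Snapshot {
    private final Map<String, Double> oldScores = new HashMap<>();

    Snapshot(Collection<ScoredEntity> entities) {
        for (ScoredEntity e : entities) oldScores.put(e.id, e.score);
    }

    double oldScoreOf(ScoredEntity e) { return oldScores.get(e.id); }
}

class SnapshotDemo {
    // Placeholder rule (assumption): each new score is the mean of the
    // entity's own OLD score and the other entity's OLD score.
    static void updateBoth(ScoredEntity a, ScoredEntity b) {
        Snapshot before = new Snapshot(Arrays.asList(a, b));
        a.score = (before.oldScoreOf(a) + before.oldScoreOf(b)) / 2;
        // Reads the snapshot, so it does not see a's freshly updated score.
        b.score = (before.oldScoreOf(b) + before.oldScoreOf(a)) / 2;
    }

    public static void main(String[] args) {
        ScoredEntity reader = new ScoredEntity("reader-1", 0.2);
        ScoredEntity paper = new ScoredEntity("paper-1", 0.8);
        updateBoth(reader, paper);
        System.out.println(reader.score + " " + paper.score);
    }
}
```

Without the snapshot, the second assignment would pick up the first entity's new score and the result would depend on which entity happened to be updated first.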

Will some of you other guys, please correct any glaring errors, because I have written this more or less as I worked it out; well a quick second draft. I have even run out of time to proof read it.

I hope this helps.

Gil

grooke · Dec 13, 2005

Woops. Some 'Sillies'

The 'Article:Score Article()' operation will also 'Score Reader()'. This returns through the 'Reader-Article' link class

should read
"The 'Article:Score Article()' operation returns through the 'Reader-Article' link class......"

Now, your 'scores' can be split into many fairly easily.

should read
"Now, your 'scores' can be split into many different scores fairly easily."

The last 'silly'. I overran my parking time last night and the car park was locked. Cost me £32 in taxis :-{{

Anyway, I hope you can follow the abbreviated story.
I just noticed that I can post diagrams. I might find time to do this tomorrow.

Gil

maiago · Dec 13, 2005

Thanks Gil.
I'm astonished by the quality and the speed of your response... I'm not an expert, so my judgement may not count for much, but I lost days on this problem without getting such a functional solution. My problem, I think, was my typical object-oriented newbie approach: I looked for some class like "System" or "Updater" (...or both) which did more or less everything, perhaps with some subcomponents (...in order to give a reason to my OBJECT ORIENTED design!). I was not able to find a different design like yours.
I tried to study the DSpace source code to pick up some good tips, but it usually led me the wrong way. But it raises two more questions:

1) "Readers" of the quality control subsystem are effectively Subscribers of DSpace, and they are modeled through a very simple class, "EPerson". A "Reader" class will be associated to each "EPeople" class. In the same way an "Item" represents an "Article". For "Author" the situation is different: subscribers ("EPerson"), and only they, can submit (...publish) papers. But they don't need to be authors of these papers; in order to follow the OAI protocol, authors are treated as metadata (a string...) of the article (like date, contributor, type, etc...). So I will need to instantiate an "Author" class, and so create a new author entry in my database, during a submission process. I imagine something like a call "Author.create(Name, Surname):boolean", that looks up the name/surname in the author's table and creates a new entry if it doesn't find it. All obviously put in the right DSpace class.
Your idea works in the same way? I refer to the "Update Subscription" use case and to the "Author"-specializes-"Subscriber role" relationship. Perhaps the actors will be Subscriber Role, Reader and Submitter. Perhaps "Submitter role" would be a subclass of "Subscriber", in a 1-0..* relation with "Article", with a link class named "Author". I will study the problem more deeply; I read your paper just an hour and a half ago, so this is still a me-to-me question (...ah my English).
2) I will need to modify DSpace code, as in the previous question to create an author, or when the web interface asks the subsystem to register a "Judgement" (your link class Reader-Article, I think). The question is: in order to better separate DSpace from the quality control subsystem, is it a good idea to create one class containing factory methods that instantiate objects of the "Reader", "Author", "Article", etc. classes?
And a last one:
3) The best way to separate the updating algorithms could be this: three parent classes, "Article Updater", "Reader Updater" and "Author Updater". The first two would be associated with the Reader-Article link class ("Judgement"), the last one with the Author (or Submitter)-Article link class, each one specialized into a concrete Updater as in the Strategy design pattern?
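
To make question 3 concrete, here is a minimal sketch of the Strategy idea for just one of the three updaters. The interface, both concrete formulas and the "ScoredArticle" class are hypothetical placeholders, not the formulas of the paper.

```java
// Strategy interface: one method that turns an old score into a new one.
interface ArticleUpdater {
    double update(double oldScore, double judgement, double readerScore);
}

// Placeholder formula (assumption): move towards the judgement by an
// amount proportional to the reader's own score.
class WeightedByReaderUpdater implements ArticleUpdater {
    @Override
    public double update(double oldScore, double judgement, double readerScore) {
        return oldScore + readerScore * (judgement - oldScore);
    }
}

// Placeholder formula (assumption): ignore the reader, average old and new.
class IgnoreReaderUpdater implements ArticleUpdater {
    @Override
    public double update(double oldScore, double judgement, double readerScore) {
        return (oldScore + judgement) / 2;
    }
}

// Context class: holds a score and whichever updater is currently plugged in.
class ScoredArticle {
    private double score;
    private ArticleUpdater updater;

    ScoredArticle(double initialScore, ArticleUpdater updater) {
        this.score = initialScore;
        this.updater = updater;
    }

    // Swapping the formula does not touch the article class itself.
    void setUpdater(ArticleUpdater updater) { this.updater = updater; }

    void judge(double judgement, double readerScore) {
        score = updater.update(score, judgement, readerScore);
    }

    double getScore() { return score; }
}

class StrategyDemo {
    public static void main(String[] args) {
        ScoredArticle article = new ScoredArticle(0.5, new WeightedByReaderUpdater());
        article.judge(1.0, 0.5);
        article.setUpdater(new IgnoreReaderUpdater());
        article.judge(0.25, 0.9);
        System.out.println(article.getScore());
    }
}
```

"Reader Updater" and "Author Updater" would follow the same shape with their own interfaces.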

OK. I will study the situation better, and as soon as I can I will post something on my progress, some UML schemas I think. I hope I'm not asking too much, but I would be delighted if you would give me your opinion.

Thanks for your help. Pay... attention to your car. Bye

maiago · Dec 13, 2005

...sorry.
In the first question:

...A "Reader" class will be associated to each "EPeople"
is wrong, I mean:
...A "Reader" class will be associated to each "EPerson"

...Your idea works in the same way?
I mean:
...Your idea works also in this situation?

bye

BobRodes · Dec 14, 2005

<I overran my parking time last night and the car park was locked. Cost me £32 in taxies.

Gotta hate that big city life...Gil gets another star for his dedication.

maiago, feel free to post any questions you like. Also, if you want a comprehensive book on UML, that goes into some detail on some of the questions you're asking, check "The UML Bible" by Tom Pender (Wiley publishing, 2004).

HTH

Bob

grooke · Dec 15, 2005

Always look at the outline description or major objectives of the project. When I went to your reference, most of my classes were described in the following few lines.

Each paper has some scores, measuring its quality (accuracy, comprehensibility, novelty, and so on). This score is .....later dynamically updated on the basis of the readers’ judgments. A subscriber to the journal is an author or a reader (or both). Each subscriber has a score too, .....later updated on the basis of the activity of the subscriber (if the subscriber is both an author and a reader, she has two different scores, one as an author and one as a reader).

Even when you're reverse engineering or interfacing to an existing package, start with what YOU want to achieve. Maybe 'DSpace' is not OO structured or maybe it is NOT the shape you want.
Decide what you want.
Look at what DSpace can offer.
Build a 'Wrapper' or 'Facade' between the two, so that your system makes sense for your problem and the facade makes allowances for the differences.
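
As a rough idea of what such a facade might look like: the sketch below is the only class the host system would talk to, and it hands out quality-control objects keyed by whatever the host already knows (a numeric person id, or an author name taken from metadata). It deliberately uses no real DSpace API; every name here, including the idea of an integer person id, is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class QcReader {
    final int personId;   // id of the person in the host system (assumed to be an int)
    double score;

    QcReader(int personId) { this.personId = personId; }
}

class QcAuthor {
    final String name;
    final String surname;
    double score;

    QcAuthor(String name, String surname) {
        this.name = name;
        this.surname = surname;
    }
}

// Single entry point for the host system; hides how readers and authors are created.
class QualityControlFacade {
    private final Map<Integer, QcReader> readersByPersonId = new HashMap<>();
    private final Map<String, QcAuthor> authorsByKey = new HashMap<>();

    // Find-or-create: the same person id always yields the same reader object.
    QcReader readerFor(int personId) {
        return readersByPersonId.computeIfAbsent(personId, id -> new QcReader(id));
    }

    // Find-or-create by name. Matching on the name alone is an assumption and
    // cannot tell two different authors with the same name apart.
    QcAuthor authorFor(String name, String surname) {
        String key = (surname.trim() + "," + name.trim()).toLowerCase(Locale.ROOT);
        return authorsByKey.computeIfAbsent(key,
                k -> new QcAuthor(name.trim(), surname.trim()));
    }

    int authorCount() { return authorsByKey.size(); }
}

class FacadeDemo {
    public static void main(String[] args) {
        QualityControlFacade qc = new QualityControlFacade();
        qc.authorFor("Ada", "Lovelace");
        qc.authorFor(" ada ", "LOVELACE");   // same author, different spelling
        System.out.println(qc.authorCount() + " author(s) registered");
    }
}
```

In memory maps stand in for the author table here; a real version would look the entries up in the database instead.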

'System' is a very special class, which you should only think about when implementing 'non-functional requirements'.
'Updater' doesn't carry a persistent state of your system (objects carry persistent state); it is almost a verb, 'update', changed to provide a class that lets something perform this vague function.
Look for things you can 'kick'; grammatically, 'concrete nouns': readers, subscribers, scores, articles, authors etc.

Incidentally, the concept of having scores as attributes is another 'silly'. If there is a class of 'score', it could have attributes of 'value', 'steadiness' etc AND it could have an operation that says how to operate on that type of score. This would change the class structure a bit and let you swap out some simple inheritance and replace it with 'interfaces'. But that is another story.

However, it gets you away from the horrible noun 'Updater'. How many 'Updaters' have you shaken hands with or kicked? 'Scores' are not kickable, but they are real life objects. I think the idea is the same though. If you decide that articles have 6 different scores, you make a 'list' or collection of 6 scores and add in 6 different sub-classes of score, which automatically give you the updating rules in their particular operations. Then you don't really care how they store their data; they can use terms like 'value' and 'steadiness' or they can use two different 'values' and a 'standard deviation' and a 'number of updates made' variable or whatever.
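
A minimal sketch of that idea, with two of the possible score types: each kind of score carries its own update rule and its own private data, and the article only holds a collection of them. The names and both update rules are assumptions; in particular, 'steadiness' is reduced here to a simple count of judgements, which is not necessarily what the paper means by it.

```java
import java.util.ArrayList;
import java.util.List;

interface Score {
    String name();
    void update(double judgement);
    double value();
}

// Stores a value plus a 'steadiness' (here simply the number of judgements).
class SteadyScore implements Score {
    private final String name;
    private double value;
    private int steadiness;

    SteadyScore(String name) { this.name = name; }

    @Override public String name() { return name; }
    @Override public double value() { return value; }

    @Override public void update(double judgement) {
        value = (value * steadiness + judgement) / (steadiness + 1);
        steadiness++;
    }
}

// Stores running sums instead, so it can also report a standard deviation.
class SpreadScore implements Score {
    private final String name;
    private int count;
    private double sum;
    private double sumOfSquares;

    SpreadScore(String name) { this.name = name; }

    @Override public String name() { return name; }
    @Override public double value() { return count == 0 ? 0 : sum / count; }

    @Override public void update(double judgement) {
        count++;
        sum += judgement;
        sumOfSquares += judgement * judgement;
    }

    double standardDeviation() {
        if (count == 0) return 0;
        double mean = sum / count;
        return Math.sqrt(Math.max(0, sumOfSquares / count - mean * mean));
    }
}

// The article just keeps a collection of scores and never looks inside them.
class RatedArticle {
    private final List<Score> scores = new ArrayList<>();

    void addScore(Score score) { scores.add(score); }

    void judge(String scoreName, double judgement) { find(scoreName).update(judgement); }

    double valueOf(String scoreName) { return find(scoreName).value(); }

    private Score find(String scoreName) {
        for (Score s : scores) {
            if (s.name().equals(scoreName)) return s;
        }
        throw new IllegalArgumentException("unknown score: " + scoreName);
    }
}

class ScoreDemo {
    public static void main(String[] args) {
        RatedArticle article = new RatedArticle();
        article.addScore(new SteadyScore("accuracy"));
        article.addScore(new SpreadScore("novelty"));
        article.judge("accuracy", 0.5);
        article.judge("novelty", 0.75);
        System.out.println(article.valueOf("accuracy") + " " + article.valueOf("novelty"));
    }
}
```

Adding a seventh score then means writing one more class that implements the interface and adding it to the collection; nothing else changes.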

I'm learning!

Gil

BobRodes · Dec 15, 2005

This is all very good advice. A lot of the same applies when you are creating a database structure, too. "Things you can kick" is a fine way of describing "kernel entities", for example. Nice piece of work, Gil.

I'm afraid I can't resist observing that I would be more likely to kick an updater than shake hands with one, curmudgeon that I am...

Bob
