Identifying Appropriate Classes

Our first challenge in object modeling is to determine what classes we're going to need as our system building blocks. Unfortunately, the process of class identification is rather "fuzzy"; it relies heavily on intuition, prior modeling experience, and familiarity with the subject area, or domain, of the system to be developed. So, how does an object-modeling novice ever get started? One tried-and-true (but somewhat tedious) procedure for identifying candidate classes is the "hunt and gather" method: that is, hunt for and gather a list of all nouns/noun phrases from the project documentation set, and then use a process of elimination to whittle this list down to a set of appropriate classes.

In the case of the SRS, our documentation set thus far consists of the following:

The requirements specification
The use case model that we prepared in Chapter 9

Noun Phrase Analysis

Let's first perform noun phrase analysis on the SRS requirements specification, which was originally presented in the Introduction and is reproduced in the following sidebar with all noun phrases highlighted.

We have been asked to develop an automated Student Registration System (SRS) for the university. This system will enable students to register online for courses each semester, as well as track their progress toward completion of their degree.

When a student first enrolls at the university, he/she uses the SRS to set forth a plan of study as to which courses he/she plans on taking to satisfy a particular degree program, and chooses a faculty advisor. The SRS will verify whether or not the proposed plan of study satisfies the requirements of the degree that the student is seeking.

Once a plan of study has been established, then, during the registration period preceding each semester, students are able to view the schedule of classes online, and choose whichever classes they wish to attend, indicating the preferred section (day of the week and time of day) if the class is offered by more than one professor. The SRS will verify whether or not the student has satisfied the necessary prerequisites for each requested course by referring to the student's online transcript of courses completed and grades received (the student may review his/her transcript online at any time).

Assuming that (a) the prerequisites for the requested course(s) are satisfied, (b) the course(s) meet(s) one of the student's plan of study requirements, and (c) there is room available in each of the class(es), the student is enrolled in the class(es).

If (a) and (b) are satisfied, but (c) is not, the student is placed on a first-come, first-served wait list. If a class/section that he/she was previously waitlisted for becomes available (either because some other student has dropped the class or because the seating capacity for the class has been increased), the student is automatically enrolled in the waitlisted class, and an email message to that effect is sent to the student. It is his/her responsibility to drop the class if it is no longer desired; otherwise, he/she will be billed for the course.

Students may drop a class up to the end of the first week of the semester in which the class is being taught.

A simple spreadsheet serves as an ideal tool for recording our initial findings; just enter noun phrases as a single-column list in the order in which they occur in the specification. Don't worry about eliminating duplicates or consolidating synonyms just yet; we'll do that in a moment. The resultant spreadsheet is shown in part in Figure 10-1.

Figure 10-1: Noun phrases found in the SRS specification

We're working with a very concise requirements specification (approximately 350 words in length), and yet this process is already proving to be very tedious! It would be impossible to carry out an exhaustive noun phrase analysis for anything but a trivially simple specification. If you're faced with a voluminous requirements specification, start by writing an "executive summary" of no more than a few pages to paraphrase the system's mission, and then use your summary version of the specification as the starting point for your noun survey. Paraphrasing a specification in this fashion provides the added benefit of ensuring that you have read through the system requirements and understand the "big picture." Of course, you'll need to review your summary narrative with your customers/users to ensure that you've accurately captured all key points.

After you've typed all of the nouns/noun phrases into the spreadsheet, sort the spreadsheet and eliminate duplicates; this includes eliminating plural forms of singular terms (e.g., eliminate "students" in favor of "student"). We want all of our class names to be singular in the final analysis, so if any plural forms remain in the list after eliminating duplicates (e.g., "prerequisites"), make these singular, as well. In so doing, our SRS list shrinks to 38 items in length, as shown in Figure 10-2.

Figure 10-2: Removing duplicates streamlines the noun phrase list.
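The sort-and-deduplicate pass described above can also be done in a few lines of code rather than in a spreadsheet. The following is only a minimal sketch: the NounPhraseList class, its Normalize method, and the hand-maintained plural-to-singular table are hypothetical names introduced for illustration, and the table's keys are assumed to be lowercase.

```csharp
using System;
using System.Collections.Generic;

public static class NounPhraseList
{
    // Hypothetical helper (not part of the SRS): trims and lowercases each
    // phrase, swaps known plural forms for their singular form, and returns
    // the distinct phrases in alphabetical (ordinal) order.
    public static List<string> Normalize(
        IEnumerable<string> rawPhrases,
        IDictionary<string, string> pluralToSingular)
    {
        // A SortedSet ignores duplicate entries and keeps its items sorted.
        var unique = new SortedSet<string>(StringComparer.Ordinal);

        foreach (string raw in rawPhrases)
        {
            string phrase = raw.Trim().ToLowerInvariant();
            if (phrase.Length == 0)
            {
                continue; // skip blank spreadsheet cells
            }

            // The plural-to-singular table is maintained by hand (an
            // assumption of this sketch), because English plurals are too
            // irregular for a simple rule. Keys are expected in lowercase.
            if (pluralToSingular.ContainsKey(phrase))
            {
                phrase = pluralToSingular[phrase];
            }

            unique.Add(phrase);
        }

        return new List<string>(unique);
    }
}
```

Given the phrases "student," "Students," "courses," "course," and "prerequisites," plus a table that maps the three plural forms to their singulars, Normalize would return a three-item list: "course," "prerequisite," and "student."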

Remember, we're trying to identify both physical and conceptual objects: as stated in Chapter 3, "something mental or physical toward which thought, feeling, or action is directed." Let's now make another pass to eliminate the following:

References to the system itself ("automated Student Registration System," "SRS," "system").
References to the university. Because we're building the SRS within the context of a single university, the university in some senses "sits outside" and "surrounds" the SRS; we don't need to manipulate information about the university within the SRS, and so we may eliminate the term "university" from our candidate class list.

Note, however, that if we were building a system that needed to span multiple universities—say, a system that compared graduate programs of study in information technology across the top 100 universities in the country—then we would indeed need to model each university as a separate object, in which case we'd keep "university" on our candidate class list.
Other miscellaneous terms that don't seem to fit the definition of an object are "completion," "end," "progress," "responsibility," "registration period," and "requirements of the degree." Admittedly, some of these are debatable, particularly the last two; to play it safe, you may wish to create a list of rejected terms to be revisited later on in the modeling life cycle.

The list shrinks to 27 items as a result, as shown in Figure 10-3—it's starting to get manageable now!

Figure 10-3: Further streamlining the SRS noun phrase list

The next pass is a bit trickier. We need to group apparent synonyms, to choose the one designation from among each group of synonyms that is best suited to serve as a class name. Having a subject matter expert on your modeling team is important for this step, because determining the subtle shades of meaning of some of these terms so as to group them properly isn't always easy.

We've grouped together terms that seem to be synonyms in Figure 10-4, bolding the term in each synonym group that we're inclined to choose above the rest; italicized words represent those terms for which no synonyms have been identified.

Figure 10-4: Grouping synonyms

Let's now review the rationale for our choices.

We choose the shorter form of equivalent expressions whenever possible ("degree" instead of "degree program" and "plan of study" instead of "plan of study requirements") to make our model more concise.

Although they aren't synonyms as such, the notion of a transcript implies a record of "courses completed" and "grades received," so we'll opt to drop the latter two noun phrases for now.

When choosing candidate class names, we should avoid choosing nouns that imply roles between objects. As you learned in Chapter 5, a role is something that an object belonging to class A possesses by virtue of its relationship to/association with an object belonging to class B. For example, a professor holds the role of "faculty advisor" when that professor is associated with a student via an advises association. Even if a professor were to lose all of his or her advisees, thus losing the role of faculty advisor, he or she would still be a professor by virtue of being employed by the university—it's inherent in the person's nature relative to the SRS.

Note

If a professor were to lose his or her job with the university, one might argue that he or she is no longer a professor; but then, this person would have no dealings with the SRS, either, so it's a moot point.

For this reason, we prefer "Professor" to "Faculty Advisor" as a candidate class name, but make a mental note to ourselves that faculty advisor would make a good potential association when we get to considering such things later on.
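To make the distinction between a class and a role concrete, here is a minimal sketch of how "faculty advisor" might eventually surface in C# as the name of a reference from Student to Professor rather than as a class of its own. The member names (FacultyAdvisor, Advisees, ChooseAdvisor, IsFacultyAdvisor) are assumptions made for illustration; they aren't taken from the SRS design.

```csharp
using System.Collections.Generic;

public class Professor
{
    public string Name { get; }

    // The students this professor currently advises.
    private readonly List<Student> advisees = new List<Student>();

    public Professor(string name)
    {
        Name = name;
    }

    public IReadOnlyList<Student> Advisees => advisees;

    // "Faculty advisor" is a role, not a separate class: a professor plays
    // the role only while he or she has at least one advisee.
    public bool IsFacultyAdvisor => advisees.Count > 0;

    internal void AddAdvisee(Student student)
    {
        if (!advisees.Contains(student))
        {
            advisees.Add(student);
        }
    }

    internal void RemoveAdvisee(Student student)
    {
        advisees.Remove(student);
    }
}

public class Student
{
    public string Name { get; }

    // The role name appears as the name of the reference.
    public Professor FacultyAdvisor { get; private set; }

    public Student(string name)
    {
        Name = name;
    }

    // Keeps both ends of the advises association in step.
    public void ChooseAdvisor(Professor professor)
    {
        if (FacultyAdvisor != null)
        {
            FacultyAdvisor.RemoveAdvisee(this);
        }

        FacultyAdvisor = professor;

        if (professor != null)
        {
            professor.AddAdvisee(this);
        }
    }
}
```

In this sketch, a Professor who loses every advisee is still a Professor object; only the value of IsFacultyAdvisor changes, which mirrors the reasoning above.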

Regarding the notion of a course, we see that we've collected numerous noun phrases that all refer to a course in one form or another: "class," "course," "preferred section," "requested course," "section," "prerequisite," "waitlisted class," "class that they were previously waitlisted for," "section that they were previously waitlisted for." Within this grouping, several roles are implied:

"Waitlisted class" in its several different forms implies a role in an association between a Student and a Course.
"Prerequisite" implies a role in an association between two Courses.
"Requested course" implies a role in an association between a Student and a Course.
"Preferred section" implies a role in an association between a Student and a Course.

Eliminating all of these role designations, we're left with only three terms: "class," "course," and "section." Before we hastily eliminate all but one of these as synonyms, let's think carefully about what real-world concepts we're trying to represent.

The notion that we typically associate with the term "course" is that of a semester-long series of lectures, assignments, exams, etc., that all relate to a particular subject area, and which are a unit of education toward earning a degree. For example, Beginning Math is a course.
The terms "class" and "section," on the other hand, generally refer to the offering of a particular course in a given semester on a given day of the week and at a given time of day. For example, the course Math 101 is being offered this coming Spring semester as three classes/sections:
Section 1, which meets Tuesdays from 4 to 6 p.m.
Section 2, which meets Wednesdays from 6 to 8 p.m.
Section 3, which meets Thursdays from 3 to 5 p.m.

There is thus a one-to-many association between Course and Class/Section. The same course is offered potentially many times in a given semester and over many semesters during the "lifetime" of the course.

Therefore, "course" and "class/section" truly represent different abstractions, and we'll keep both concepts in our candidate class list. Since "class" and "section" appear to be synonyms, however, we need to choose one term and discard the other. Our initial inclination would be to keep "class" and discard "section," but in order to avoid confusion when referring to a class named Class (!) we'll opt for "section" instead.
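The one-to-many association between Course and Section can be sketched in C# as follows. This is an illustrative outline only: the ScheduleSection method, the Offerings collection, and the constructor parameters are assumptions, and the attribute types are placeholders that may well change as the model evolves.

```csharp
using System.Collections.Generic;

public class Course
{
    public string CourseName { get; }

    // The "many" end: every Section that is an offering of this Course.
    private readonly List<Section> offerings = new List<Section>();

    public Course(string courseName)
    {
        CourseName = courseName;
    }

    public IReadOnlyList<Section> Offerings => offerings;

    // Creates a Section that refers back to this Course and records it.
    public Section ScheduleSection(
        int sectionNo, string semester, string dayOfWeek, string timeOfDay)
    {
        var section = new Section(this, sectionNo, semester, dayOfWeek, timeOfDay);
        offerings.Add(section);
        return section;
    }
}

public class Section
{
    // The "one" end: each Section is an offering of exactly one Course.
    public Course OfferingOf { get; }
    public int SectionNo { get; }
    public string Semester { get; }
    public string DayOfWeek { get; }
    public string TimeOfDay { get; }

    internal Section(
        Course offeringOf, int sectionNo, string semester,
        string dayOfWeek, string timeOfDay)
    {
        OfferingOf = offeringOf;
        SectionNo = sectionNo;
        Semester = semester;
        DayOfWeek = dayOfWeek;
        TimeOfDay = timeOfDay;
    }
}
```

Scheduling the three Math 101 sections from the example would leave one Course object whose Offerings collection holds three Section objects, each referring back to that same Course.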

Refining the Candidate Class List

A list of candidate classes has begun to emerge from the fog! Here is our remaining "short list" (please disregard the trailing symbols [*, +] for the moment—we'll explain their significance shortly):

Course
Day of week*
Degree*
Email message+
Plan of study
Professor
Room*
Schedule of classes+
Seating capacity*
Section
Semester*
Student
Time of day*
Transcript
(First-come, first-served) Wait list

Not all of these will necessarily survive to the final model, however, as we're going to scrutinize each one very closely before deeming it worthy of implementation as a class. One classic test for determining whether or not an item can stand on its own as a class is to ask these questions:

Can we think of any attributes for this class?
Can we think of any services that would be expected of objects belonging to this class?

One example is the term "room": we could invent a Room class as follows:

public class Room {
  // Attributes.
  int roomNo;
  string building;
  int seatingCapacity;
  // etc.
}

or we could simply represent a room location as a string attribute of the Section class:

public class Section {
  // Attributes.
  Course offeringOf;
  string semester;
  char dayOfWeek; // 'M', 'T', 'W', 'R', 'F'
  string timeOfDay;
  string classroomLocation; // building name and room name: e.g.,
                            // "Government Hall Room 105"
  // etc.
}

Which approach to representing a room is preferred? It all depends on whether or not a room needs to be a focal point of our application. If the SRS were meant to also do "double duty" as a Classroom Scheduling System, then we may indeed wish to instantiate Room objects so as to be able to ask them to perform such services as printing out their weekly usage schedules or telling us their seating capacities. However, since these services weren't mentioned as requirements in the SRS specification, we'll opt for making a room designation a simple string attribute of the Section class. We reserve the right, however, to change our minds about this later on; it's not unusual for some items to "flip flop" over the life cycle of a modeling exercise between being classes on their own versus being represented as simple attributes of other classes.

Following a similar train of thought for all of the items marked with an asterisk (*) in the preceding candidate class list, we'll opt to treat them all as attributes rather than making them classes of their own:

"Day of week" will be incorporated as either a string or char attribute of the Section class.
"Degree" will be incorporated as a string attribute of the Student class.
"Seating capacity" will be incorporated as an int attribute of the Section class.
"Semester" will be incorporated as a string attribute of the Section class.
"Time of day" will be incorporated as a string attribute of the Section class.

When we're first modeling an application, we want to focus exclusively on functional requirements to the exclusion of technical requirements, as defined in Chapter 9; this means that we need to avoid getting into the technical details of how the system is going to function behind the scenes. Ideally, we want to focus solely on what are known as domain classes (that is, abstractions that an end user will recognize, and which represent "real-world" entities) and to avoid introducing any extra classes that are used solely as behind-the-scenes "scaffolding" to hold the application together, known alternatively as implementation classes or solution space classes. Examples of the latter would be the creation of a collection object to organize and maintain references to all of the Professor objects in the system, or the use of a dictionary to provide a way to quickly find a particular Student object based on the associated student ID number. We'll talk more about solution space objects in Part Three of the book; for the time being, the items flagged with a plus sign (+) in the candidate class list earlier, "email message" and "schedule of classes," seem arguably more like implementation classes than domain classes.
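To give a feel for what a solution space class looks like, here is a minimal sketch of the dictionary example just mentioned. The StudentDirectory class and its Add and FindById methods are hypothetical names invented for this illustration, and the Student class is pared down to two attributes; an end user would never ask for a "student directory," which is what marks it as scaffolding rather than a domain class.

```csharp
using System.Collections.Generic;

// Domain class: an abstraction that an SRS user would recognize.
public class Student
{
    public string StudentId { get; }
    public string Name { get; }

    public Student(string studentId, string name)
    {
        StudentId = studentId;
        Name = name;
    }
}

// Implementation ("solution space") class: behind-the-scenes scaffolding
// that exists only to find Student objects quickly by ID.
public class StudentDirectory
{
    private readonly Dictionary<string, Student> studentsById =
        new Dictionary<string, Student>();

    // Registers a student, replacing any earlier entry with the same ID.
    public void Add(Student student)
    {
        studentsById[student.StudentId] = student;
    }

    // Returns the matching Student, or null if the ID is unknown.
    public Student FindById(string studentId)
    {
        Student found;
        return studentsById.TryGetValue(studentId, out found) ? found : null;
    }
}
```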

An email message is typically a transient piece of data, not unlike a popup message that appears on the screen while using an application: it gets sent out of the SRS, and after it's read by the recipient, we have no control over whether the email is retained or deleted. It's unlikely that the SRS is going to archive copies of all email messages that have been sent (there certainly was no requirement to do so), so we won't worry about modeling them as objects at this stage in our analysis.

Email messages will resurface in Chapter 11, when we talk about the behaviors of the SRS application, because sending an email message is definitely an important behavior; emails don't constitute an important structural piece of the application, however, so we don't want to introduce a class for them at this stage in the modeling process. When we actually get to programming the system, we might indeed create an EmailMessage class in C#, but it needn't be modeled as a domain class. (If, on the other hand, we were modeling an email messaging system in anticipation of building one, then EmailMessage would indeed be a key domain class in our model.)
We could go either way with the schedule of classes—include it as a candidate class, or drop it from our list. The schedule of classes, as a single object, may not be something that the user will manipulate directly, but there will be some notion behind the scenes of a schedule of classes collection controlling which Section objects should be presented to the user as a GUI pick list when he or she registers in a given semester. We'll omit ScheduleOfClasses from our candidate class list for now, but can certainly revisit our decision as the model evolves.

Determining whether a class constitutes a domain class or an implementation class is admittedly a gray area, and either of the preceding candidate class "rejects" could be successfully argued into or out of the list of core domain classes for the SRS. In fact, this entire exercise of identifying classes hopefully illustrates a concept that was first introduced in Chapter 2; because of its importance, we'll repeat it in the following sidebar.

… Developing an appropriate model for a software system is perhaps the most difficult aspect of software engineering, because:

There are an unlimited number of possibilities. Abstraction is to a certain extent in the eye of the beholder: several different observers working independently are almost guaranteed to arrive at different models. Whose is the best? Passionate arguments have ensued!

To further complicate matters, there is virtually never only one "best" or "correct" model, only "better" or "worse" models relative to the problem to be solved. The same situation can be modeled in a variety of different, equally valid ways… .

… There is no "acid test" to determine if a model has adequately captured all of a user's requirements.

As we continue along with our SRS modeling exercise, and particularly as we move from modeling to implementation in Part Three of the book, we'll have many opportunities to rethink the decisions that we've made here. The key point to remember is that the model isn't "cast in stone" until we actually begin programming, and even then, if we've used objects wisely, the model can be fairly painlessly modified to handle most new requirements. Think of a model as being formed out of modeling clay: we'll continue to reshape it over the course of the analysis and design phases of our project until we're satisfied with the result.

Meanwhile, back to the task of coming up with a list of candidate classes for the SRS. The terms that have survived our latest round of scrutiny are as follows:

Course
PlanOfStudy
Professor
Section
Student
Transcript
WaitList

Let's examine WaitList one last time. There is indeed a requirement for the SRS to maintain a student's position on a first-come, first-served wait list. It turns out, however, that this requirement can be handled through a combination of an association between the Student and Section classes, plus something known as an association class, which you'll learn about later in this chapter. This would not be immediately obvious to a beginning modeler, and so we'd fully expect that the WaitList class might make the final cut as a suggested SRS class. We're going to assume, though, that we have an experienced object modeler on the team who convinces us to eliminate the class; we'll see that this was a suitable move when we complete the SRS class diagram at the end of the chapter.
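As a rough preview of the idea, the sketch below shows one possible shape for such a solution in C#. It is not necessarily the design that the chapter arrives at later: the WaitListEntry class (standing in for an association class that records when a student asked for a seat), the Enroll, Drop, and IsEnrolled methods, and the use of a queue are all assumptions made for illustration, and the Student class is pared down to a single attribute.

```csharp
using System;
using System.Collections.Generic;

public class Student
{
    public string Name { get; }

    public Student(string name)
    {
        Name = name;
    }
}

// Hypothetical association class: describes one Student-to-Section link
// for a student who is waiting for a seat.
public class WaitListEntry
{
    public Student Student { get; }
    public DateTime RequestedAt { get; }

    public WaitListEntry(Student student, DateTime requestedAt)
    {
        Student = student;
        RequestedAt = requestedAt;
    }
}

public class Section
{
    public int SeatingCapacity { get; }

    private readonly List<Student> enrolled = new List<Student>();

    // A queue preserves first-come, first-served order, assuming that
    // Enroll is called in the order in which the requests arrive.
    private readonly Queue<WaitListEntry> waitList = new Queue<WaitListEntry>();

    public Section(int seatingCapacity)
    {
        SeatingCapacity = seatingCapacity;
    }

    public bool IsEnrolled(Student student)
    {
        return enrolled.Contains(student);
    }

    // Returns true if the student got a seat, false if he or she was
    // placed on the wait list instead.
    public bool Enroll(Student student, DateTime requestedAt)
    {
        if (enrolled.Count < SeatingCapacity)
        {
            enrolled.Add(student);
            return true;
        }

        waitList.Enqueue(new WaitListEntry(student, requestedAt));
        return false;
    }

    // Drops an enrolled student and promotes the longest-waiting student,
    // if any. Returns the promoted student, or null if nobody was promoted.
    public Student Drop(Student student)
    {
        if (!enrolled.Remove(student) || waitList.Count == 0)
        {
            return null;
        }

        Student next = waitList.Dequeue().Student;
        enrolled.Add(next);
        return next;
    }
}
```

Notice that no WaitList class appears anywhere in the sketch: the wait list is simply the set of waiting links between one Section and its Students.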

So, we'll settle on the following list of classes, based on our noun phrase analysis of the SRS specification:

Course
PlanOfStudy
Professor
Section
Student
Transcript

Revisiting the Use Cases

One more thing that we need to do before we deem our class list good to go is to revisit our use cases—in particular, the actors—to see if any of them ought to be added as classes. You may recall that we identified seven potential actors for the SRS in Chapter 9:

Student
Faculty
Department Chair
Registrar
Billing System
Admissions System
Classroom Scheduling System

Do any of these deserve to be modeled as classes in the SRS? Here's how to make that determination: if any user associated with any actor type A is going to need to manipulate (access or modify) information concerning an actor type B when A is logged onto the SRS, then B needs to be included as a class in our model. This is best illustrated with a few examples.

When a student logs onto the SRS, might he or she need to manipulate information about faculty? Yes; when a student selects an advisor, for example, he or she might need to view information about a variety of faculty members in order to choose an appropriate advisor. So, the Faculty actor role must be represented as a class in the SRS; indeed, we have already designated a Professor class, so we're covered there. Student users are not concerned with department chairs per se, however, so this test gives us no reason to model the Department Chair actor role as a class.
Following the same logic, we'd need to represent the Student actor role as a class because when professors log onto the SRS, they will be manipulating Student objects when printing out a course roster or assigning grades to students, for example. Since Student already appears in our candidate class list, we're covered there, as well.
When any of the actors—Faculty, Students, the Registrar, the Billing System, the Admissions System, or the Classroom Scheduling System—access the SRS, will there be a need for any of them to manipulate information about the registrar? No, at least not according to the SRS requirements that we've seen so far. Therefore, we needn't model the Registrar actor role as a class.
The same holds true for the Billing, Admissions, and Classroom Scheduling Systems: they require "behind the scenes" access to information managed by the SRS, but nobody logging on to the SRS expects to be able to manipulate any of these three systems directly, so they needn't be represented by domain classes in the SRS.

Note

Again, when we get to implementing the SRS in code, we may indeed find it appropriate to create "solution space" C# classes to represent interfaces to these other automated systems, but such classes don't belong in a domain model of the SRS.

Therefore, our proposed candidate class list remains unchanged after revisiting all actor roles:

Course
PlanOfStudy
Professor
Section
Student
Transcript

Is this a "perfect" list? No—there is no such thing! In fact, before all is said and done, the list may—and in fact probably will—evolve in the following ways:

We may add classes later on: terms we eliminated from the specification, or terms that don't even appear in the specification, but which we'll unearth through continued investigation.
We may see an opportunity to generalize—that is, we may see enough commonality between two or more classes' respective attributes, methods, or relationships with other classes to warrant the creation of a common base class.
In addition, as we mentioned earlier, we may rethink our decisions regarding representing some concepts as simple attributes (semester, room, etc.) instead of as full-blown classes, and vice versa.

The development of a candidate class list is, as we've tried to illustrate, fraught with uncertainty. For this reason, it's important to have someone experienced with object modeling available to your team when embarking on your first object modeling effort. Most experienced modelers don't use the rote method of noun phrase analysis to derive a candidate class list; such folks can pretty much review a specification and directly pick out significant classes, in the same way that a professional jeweler can easily choose a genuine diamond from among a pile of fake gemstones. Nevertheless, what does "significant" really mean? That's where the "fuzziness" comes in! It's impossible to define precisely what makes one concept significant and another less so. We've tried to illustrate some rules of thumb by working through the SRS example, but you ultimately need a qualified mentor to guide you until you develop—and trust—your own intuitive sense for such things.

The bottom line, however, is that even expert modelers can't really confirm the appropriateness of a given candidate class until they see its proposed use in the full context of a class diagram that also reflects associations, attributes, and methods, which we'll explore later in this chapter as well as in Chapter 11.