January 16, 2008

Baseball cards as objects - The rest of the checklist

Yesterday I started creating an object model for baseball cards. I started with just the 1968 Topps set, going through the checklist to see how the model holds up. Each time it failed I made a change to the model to fix it.

The first mistake I made was not defining a use case for the baseball card objects. I think it was subtly implied, but as we all know, bugs feed on implications. So, at Kevin and Bob's suggestion, I'll actually define the use case I had in my mind as I worked on yesterday's entry. I'll use a simplified template for my use cases, more of a "user story" type format.

The use case

Name: Add baseball cards from 1968 Topps set
Goal: Allow the user to add baseball card definitions to a database of the 1968 Topps set.
Story: The user would like to add a card to the database for the 1968 Topps baseball card set. If the card is a duplicate, the system notifies the user and allows him to overwrite the existing card. Alternatively, if the duplicate is a variation on the existing card, the user can describe the variation, which will then show up as a different card on any generated reports.
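
As a rough illustration of the story above, here is a minimal in-memory sketch. The names CardEntry, CardSet and UseCaseDemo are hypothetical and do not appear in the original entries, and the identity rule (card number plus variation text) is an assumption made for the example rather than a settled design. The classes are package-private so the whole sketch fits in one file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, trimmed-down card: just enough fields to tell two cards apart.
class CardEntry {
  final int number;
  final String variationDescription; // empty string means "not a variation"
  final String description;

  CardEntry(int number, String variationDescription, String description) {
    this.number = number;
    this.variationDescription = variationDescription;
    this.description = description;
  }

  // Assumed identity rule: card number plus variation text.
  String key() {
    return number + "|" + variationDescription;
  }
}

// Hypothetical in-memory stand-in for the database in the story.
class CardSet {
  private final Map<String, CardEntry> cards = new LinkedHashMap<String, CardEntry>();

  // Returns false (and changes nothing) when the card is a duplicate,
  // so the caller can notify the user and ask what to do next.
  boolean add(CardEntry card) {
    if (cards.containsKey(card.key())) {
      return false;
    }
    cards.put(card.key(), card);
    return true;
  }

  // The user chose to replace the existing card.
  void overwrite(CardEntry card) {
    cards.put(card.key(), card);
  }

  int size() {
    return cards.size();
  }
}

class UseCaseDemo {
  public static void main(String[] args) {
    CardSet set = new CardSet();
    System.out.println(set.add(new CardEntry(10, "", "first entry")));   // true
    System.out.println(set.add(new CardEntry(10, "", "second entry")));  // false: duplicate
    set.overwrite(new CardEntry(10, "", "second entry"));                // user overwrites
    // Describing the duplicate as a variation gives it its own key.
    System.out.println(set.add(new CardEntry(10, "name misspelled", "variation")));  // true
    System.out.println(set.size());                                      // 2
  }
}
```

Returning a boolean from add keeps the "notify the user" step outside the storage class, which is one way to leave the choice between overwriting and describing a variation to the caller.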

Just like in any project, a use case or story sometimes needs fine tuning. I appreciate any comments or suggestions on the above use case. As this series of entries continues, there will be higher level stories introduced. Right now I have an overarching vision of what I want to do with my baseball card database, but I'm going to artificially limit the stories until later.

Bob blogged about my last entry and brought up a couple of other interesting points. First of all, he's mirroring the code in Ruby. Hopefully when I end up sticking a database behind the code, he'll show how to do it with a Rails application.

Bob's first question is related to the use case. He asks whether I will be searching through stats, or just searching for cards. Initially, I just want to search for cards. If stats were the goal, it would probably be easier to link the players to their entry at baseball-reference.com. Having said that, there are possible future use cases for wanting to look up some meta-statistics related to the text on the cards. I'll touch on some of this later when I talk about card misprints and their representation. So, the simple answer right now is that I want to be able to identify each card in the set uniquely so we can ignore any statistical information for now.

The second thing Bob did in his Ruby implementation was subclass the Batting Leaders card. I would be the "Domain Expert" here, and I can see that subclassing each card type based on its subject matter might not be the best approach in the long run. It's probably too narrow an approach. For instance, in the 1968 set there are only two "Batting Leaders" cards, one for the National League and one for the American League. There are ten total "Leaders" type cards that follow the same template as the "Batting Leaders" cards (e.g., "RBI Leaders", "Home Run Leaders", etc.). As we go further down the checklist we'll find that there are other templates to consider, and I think they all may fit into the object model we ended with yesterday, but we'll go through that in a sec.

One last thing I noticed Bob did was include the year as a field within the card class. I was going to go over this once we were satisfied that our model would accommodate the 1968 set and were ready to move on to another set. So, I'm going to defer considering this and other things like the manufacturer's name until later.

Besides the use case, another thing that I haven't done is write test cases for the code, or use JUnit Factory to generate tests for me. Once I get through the 1968 checklist, I'll generate the tests and go over the results.

We've made it through the first ten cards, and found a few problems with our original assertions about the model for a baseball card. Let's go through the rest of the checklist and see if we can spot anything else that doesn't fit our current model.

The next card type that looks slightly different than what we've seen before is card #16, the Cleveland Indians Rookies card that has Lou Piniella and Richie Scheinblum. It resembles the "Leaders" type cards we discussed last time. On the front are two players. At the top of the card is a circle with their team name in it, and the text "1968 Rookie Stars". On the back are the card number, some text that says "1968 Indians Rookie Stars", then some trivia text about each player and their stat line for the year. So, the "Rookie" cards fit fine in the model we've currently constructed. We can attach each of the players to the card object and use the description field to hold the "1968 Rookie Stars" text.

1968_16.jpg

We get all the way to card #67 before we see a new card type, a checklist. In 1968, Topps released the set in seven different series. Each series had a checklist of the players contained in it. The front of the checklist has "Topps Baseball" across the top. There's a picture of a selected player in a circle under that, and next to the picture is text identifying which series it's a checklist for. The rest of the card is covered with the names of all the players in that series with checkboxes next to them.

1968_67_front.jpg

Again, this seems to fit in the model. We can associate the player in the picture with the card, and use the text describing the series this is a checklist for in the description. If we wanted to, we could also associate all of the players on the checklist to the card.
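
To make the last two card types concrete, here is a hedged sketch of how card #16 and card #67 might be built with the description-plus-players model. Player and BaseballCard are trimmed copies of the classes from the previous entry. The two-argument Player constructor and the ChecklistWalkDemo class are conveniences added for this example, and the checklist's description and pictured player are placeholders, since the real values aren't given here.

```java
import java.util.ArrayList;

// Trimmed copy of the Player class from the earlier entry.
// The constructor is a convenience added for this sketch.
class Player {
  private final String name;
  private final String team;

  Player(String name, String team) {
    this.name = name;
    this.team = team;
  }

  String getName() { return name; }
  String getTeam() { return team; }
}

// Trimmed copy of BaseballCard: number, description and players only.
class BaseballCard {
  private int number;
  private String description;
  private final ArrayList<Player> players = new ArrayList<Player>();

  int getNumber() { return number; }
  void setNumber(int num) { number = num; }
  String getDescription() { return description; }
  void setDescription(String desc) { description = desc; }
  ArrayList<Player> getPlayers() { return players; }
  void addPlayer(Player player) { players.add(player); }
}

class ChecklistWalkDemo {
  // Card #16: two rookies from the same team share one card.
  static BaseballCard rookieStars() {
    BaseballCard card = new BaseballCard();
    card.setNumber(16);
    card.setDescription("1968 Rookie Stars");
    card.addPlayer(new Player("Lou Piniella", "Indians"));
    card.addPlayer(new Player("Richie Scheinblum", "Indians"));
    return card;
  }

  // Card #67: a series checklist. The description and the pictured
  // player are placeholders, not values taken from the actual card.
  static BaseballCard checklist() {
    BaseballCard card = new BaseballCard();
    card.setNumber(67);
    card.setDescription("Checklist (series text goes here)");
    card.addPlayer(new Player("(pictured player)", "(team)"));
    return card;
  }

  public static void main(String[] args) {
    BaseballCard[] cards = { rookieStars(), checklist() };
    for (BaseballCard card : cards) {
      System.out.println("#" + card.getNumber() + " " + card.getDescription()
          + " (" + card.getPlayers().size() + " player(s))");
    }
  }
}
```

Both cards go through the same class with no subclassing. The only thing that distinguishes them is the free-text description.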

We're about 1/10th of the way through the set, and we're feeling pretty comfortable with the minimal object model we've set up for our 1968 Topps set. There are a few more different card types coming up, and I want to talk about the trade-offs involved with using the description field as a catch-all for the different types we're encountering.

This entry is getting pretty long, so I'm going to stop here for now. Tomorrow, I'll talk about testing the code we've got, and try to get through some more of the checklist.

Posted by Rob at 08:37 PM | Comments (2)

January 15, 2008

Baseball cards as objects - from players to leaders

So, say you wanted to hold your baseball card collection in a database. What would the object model look like? It's one of those problems that sounds simple at first, but gets more complex as you add the intricacies and edge cases that have been introduced into baseball cards through the years, either by mistake or by design.

If your collection consisted only of cards between 1952 and 1980, it would make the problem quite a bit easier. I'll make it even easier and pick one year to start -- 1968. I've already thought through a lot of this, but there are some interesting problems right off the bat [npi].

So, think of a standard baseball card. On the front, it has a picture of the player, his name, his position and his team. On the back, there's a card number, and it repeats his name, position and team, along with some personal information (height, weight, which side he bats/throws from, his birthdate, and where he lives). It also contains yearly statistics for his career. Sometimes it has a short biographical blurb, and ends with a piece of trivia in the form of a cartoon at the bottom of the card. Here's Frank Robinson's card as an example:

1968.jpg

Now, we don't need all of this information to keep track of all of the cards we own. Once we can identify a unique card, we can just go to the collection itself to see the interesting tidbits contained on it. Later on, we can add as much detail as we like. It seems like the minimal amount of information we need is the card number itself. We probably want to add some more identifying information to make any reports we generate more interesting, but the card number seems to be a unique identifier all by itself. So, let's take a first cut at our "BaseballCard" object. I'll use Java, just because it's the language I'm most fluent in.

public class BaseballCard {

  private int number;
  private String playerName;
  private String teamName;

  public int getNumber() {
    return number;
  }

  public void setNumber(int num) {
    number = num;
  }

  public String getPlayerName() {
    return playerName;
  }

  public void setPlayerName(String name) {
    playerName = name;
  }

  public String getTeamName() {
    return teamName;
  }

  public void setTeamName(String name) {
    teamName = name;
  }

}

Let's take a look at a list of the cards from 1968 and see if our assertion about the card number fails. There are a lot of sources for card set lists, and I'm just going to use the one here for now. Starting at the beginning, a number of problems become obvious within the first ten cards. First, using just the number on the back of the card won't work. Card #10 has a variation, so there isn't actually a single card #10. Instead there are two variations, "10a" and "10b". Apparently, Topps misspelled Boston Red Sox pitcher Jim Lonborg's name, then corrected it in later printings. Many collectors will collect both versions of the card, so we need to make allowances for variations.

Another problem shows up right at card #1. There are three players on the card, and it's identified as a card that shows the National League Batting Leaders. It has pictures of Roberto Clemente, Tony Gonzalez and Matty Alou, who don't all play for the same team. It also has text at the top that says "1967 Batting Leaders", and a blue circle that identifies it as "National League".

1968_1.jpg

The back of the card has the card number, and a list of the top 50 batting leaders for the 1967 season with their names and batting averages. So let's try to modify our object model to handle the problems we've just identified.

There are a couple of ways that we could modify the class to accommodate variations. We could change the type of the number field to a String, and just accept alphabetic characters. I'm going to start off by leaving that field alone, though. I'm going to add a field that identifies whether the card is a variation, then add another field for the description of the variation. I may change my mind later, but I want to see how this works.

public class BaseballCard {

  private int number;
  private boolean variation;
  private String variationDescription;
  private String playerName;
  private String teamName;

  public int getNumber() {
    return number;
  }

  public void setNumber(int num) {
    number = num;
  }

  public String getPlayerName() {
    return playerName;
  }

  public void setPlayerName(String name) {
    playerName = name;
  }

  public String getTeamName() {
    return teamName;
  }

  public void setTeamName(String name) {
    teamName = name;
  }

  public void setVariation(boolean b) {
    variation = b;
  }

  public boolean isVariation() {
    return variation;
  }

  public void setVariationDescription(String desc) {
    variationDescription = desc;
  }

  public String getVariationDescription() {
    return variationDescription;
  }
}

That takes care of allowing variations, but what about allowing more than one player to be attached to the card? It looks like we're going to have to create a Player class and keep a list of them attached to the card. It can hold any information we want to keep about the player and be re-used for different cards. Let's just create a very simple class to start off with.

public class Player {
  private String name;
  private String team;
  
  public String getName() {
    return name;
  }
  public void setName(String n) {
    name = n;
  }
  public String getTeam() {
    return team;
  }
  public void setTeam(String t) {
    team = t;
  }
}

Now we just need to modify our BaseballCard class to hold an ArrayList of players instead of one player's name and team name. We'll also add a method to get the list of players, and a method that will allow us to add players to the list.

import java.util.ArrayList;

public class BaseballCard {

  private int number;
  private boolean variation;
  private String variationDescription;
  private ArrayList<Player> players;

  public BaseballCard() {
    players = new ArrayList<Player>();
  }

  public int getNumber() {
    return number;
  }

  public void setNumber(int num) {
    number = num;
  }

  public void setVariation(boolean b) {
    variation = b;
  }

  public boolean isVariation() {
    return variation;
  }

  public void setVariationDescription(String desc) {
    variationDescription = desc;
  }

  public String getVariationDescription() {
    return variationDescription;
  }

  public ArrayList<Player> getPlayers() {
    return players;
  }
  
  public void addPlayer(Player player) {
    players.add(player);
  }

}

Finally, since we've found that cards can be about different subjects than just a single player, I'm going to add a generic description field for now. Later a better way to identify these insert cards might come to me.

import java.util.ArrayList;

public class BaseballCard {

  private int number;
  private String description;
  private boolean variation;
  private String variationDescription;
  private ArrayList<Player> players;

  public BaseballCard() {
    players = new ArrayList<Player>();
  }

  public int getNumber() {
    return number;
  }

  public void setNumber(int num) {
    number = num;
  }

  public void setDescription(String desc) {
    description = desc;
  }

  public String getDescription() {
    return description;
  }

  public void setVariation(boolean b) {
    variation = b;
  }

  public boolean isVariation() {
    return variation;
  }

  public void setVariationDescription(String desc) {
    variationDescription = desc;
  }

  public String getVariationDescription() {
    return variationDescription;
  }

  public ArrayList<Player> getPlayers() {
    return players;
  }
  
  public void addPlayer(Player player) {
    players.add(player);
  }

}

Now we should be able to create a set of cards where each of the cards can be attached to multiple players. Any of the cards can have a variation associated with it, and we can describe the variation in detail.
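
As a usage sketch of that claim, the following builds card #1 with its three players and the two variations of card #10, then prints one report line per card. Player and BaseballCard are trimmed copies of the classes above (team and description are left out for brevity). The ReportDemo class, its helper methods and the variation wording are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Trimmed copy of Player: name only, team omitted for brevity.
class Player {
  private final String name;

  Player(String name) { this.name = name; }

  String getName() { return name; }
}

// Trimmed copy of BaseballCard: number, variation fields and players.
class BaseballCard {
  private int number;
  private boolean variation;
  private String variationDescription;
  private final ArrayList<Player> players = new ArrayList<Player>();

  int getNumber() { return number; }
  void setNumber(int num) { number = num; }
  boolean isVariation() { return variation; }
  void setVariation(boolean b) { variation = b; }
  String getVariationDescription() { return variationDescription; }
  void setVariationDescription(String desc) { variationDescription = desc; }
  ArrayList<Player> getPlayers() { return players; }
  void addPlayer(Player player) { players.add(player); }
}

class ReportDemo {
  // Hypothetical helper: one line of a report for a single card.
  static String reportLine(BaseballCard card) {
    StringBuilder line = new StringBuilder("#" + card.getNumber());
    if (card.isVariation()) {
      line.append(" [variation: ").append(card.getVariationDescription()).append("]");
    }
    String separator = " ";
    for (Player p : card.getPlayers()) {
      line.append(separator).append(p.getName());
      separator = ", ";
    }
    return line.toString();
  }

  // Hypothetical helper: builds one variation of a single-player card.
  static BaseballCard variationOf(int number, String desc, String playerName) {
    BaseballCard card = new BaseballCard();
    card.setNumber(number);
    card.setVariation(true);
    card.setVariationDescription(desc);
    card.addPlayer(new Player(playerName));
    return card;
  }

  static List<BaseballCard> sampleCards() {
    List<BaseballCard> cards = new ArrayList<BaseballCard>();
    BaseballCard leaders = new BaseballCard();
    leaders.setNumber(1);
    leaders.addPlayer(new Player("Roberto Clemente"));
    leaders.addPlayer(new Player("Tony Gonzalez"));
    leaders.addPlayer(new Player("Matty Alou"));
    cards.add(leaders);
    // The variation wording below is made up for the example.
    cards.add(variationOf(10, "name misspelled", "Jim Lonborg"));
    cards.add(variationOf(10, "name corrected", "Jim Lonborg"));
    return cards;
  }

  public static void main(String[] args) {
    for (BaseballCard card : sampleCards()) {
      System.out.println(reportLine(card));
    }
  }
}
```

Note that the two #10 cards share the same number and differ only in their variation description, so any code that looks cards up has to consider both fields together.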

Next time we'll go further down the checklist and see how well our current object model supports the different card types.

Posted by Rob at 04:17 PM | Comments (1)