Fraudulent Digits and Bogus Numbers
Today’s world is characterized by millions of electronic data transfers, financial
submissions, and other numerical data sets. It takes no effort to swipe a card at an ATM and
walk away with hundreds of dollars in cash, or to electronically submit the
corporate earnings of thousands of companies in a flash. With such speed and
efficiency comes the risk of fraud. But how does one tell legitimate
numbers from bogus ones? In 1993, in my home state of

So how was the prosecution able to prove that these checks were fraudulent? Welcome to the mathematical field of digital analysis. Digital analysis has become standard practice for CPAs and financial investigators who need to detect fraudulent corporate data and income tax returns, and it was used in validating the solutions to the anticipated Y2K problems. It is now beginning to find its way into other areas of application, such as detecting students who cheat! Anyway, let’s explore some data!
To introduce you to an amazing observation that was first made in 1881, revisited in 1938, yet left unproven until 1996, we will examine some data.
Your Written Submission Responsibilities
To make sure you have met all the requirements of this project, here is what you will need to submit to David by …… as a Word document attachment.
Task 1. Each group member is to examine their assigned group’s provided data, recording the
frequency of the first leading digit and the second leading digit in the tally
tables provided at the end of this sheet or in one of their own design. For
example, if the number is 234,670, the first leading digit is “2” while the
second leading digit is “3”.
Once the frequencies have been recorded,
create a bar graph with the first leading digit values as the horizontal
items and the vertical scale as a percent. Repeat with a separate graph using
the second leading digits as the
horizontal items (Note: Zero is now valid).
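If you would rather check your hand tallies with a computer, here is a minimal Python sketch of the counting step. The function names (`leading_digits`, `tally`, `percents`) and the sample values are assumptions made for illustration; they are not part of the assigned data sets. The sketch assumes each value is a positive whole number or a string such as "234,670".

```python
from collections import Counter


def leading_digits(value):
    """Return (first, second) leading digits of a positive number.

    `value` may be an int or a string such as "234,670". The second
    digit is None when only one significant digit is present.
    """
    # Keep digit characters only, then drop leading zeros ("0.042" -> "42").
    digits = "".join(ch for ch in str(value) if ch in "0123456789").lstrip("0")
    if not digits:
        raise ValueError(f"no nonzero digits in {value!r}")
    first = int(digits[0])
    second = int(digits[1]) if len(digits) > 1 else None
    return first, second


def tally(values):
    """Count first and second leading digits across a list of values."""
    first, second = Counter(), Counter()
    for v in values:
        d1, d2 = leading_digits(v)
        first[d1] += 1
        if d2 is not None:
            second[d2] += 1
    return first, second


def percents(counts, digits):
    """Convert tallies to percents of the total, one entry per digit."""
    total = sum(counts.values())
    return {d: 100.0 * counts[d] / total for d in digits}


# Placeholder values for illustration only, not real project data.
sample = ["234,670", 1100, 1210, 1331]
first, second = tally(sample)
print(percents(first, range(1, 10)))   # first digits run 1 to 9
print(percents(second, range(0, 10)))  # second digits may be 0
```

The percents returned here are the heights you would plot on the vertical scale of each bar graph.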
Group 1: The River Runners.
| Keith | Hsiao | Kaatz |
Your group gets the
pleasure of examining the lengths of
http://www.mc.maricopa.edu/~dschultz/Rivers.pdf
Group 2: The People Counters.
| Savage | Ascheman | Nelson |
Your group gets the
pleasure of examining recent population estimates of the World’s countries.
http://www.mc.maricopa.edu/~dschultz/Populations&Area.pdf
Group 3: The Area Reckoners.
| Handl | VanDeKop | Mount |
Your group gets the
pleasure of examining land area of the World’s countries.
http://www.mc.maricopa.edu/~dschultz/Populations&Area.pdf
Group 4: The Stock Brokers.
Show me the money! Your group gets the pleasure of examining the
trading volume of any 100 companies in either the
| Briggs | Ducheneaux | McNickle |
Group 5: The Mystery Machine Group.
Your group gets the
pleasure of examining financial statements of an unnamed HMO. Choose a starting
point and examine 100 values.
| Fattic | VanVleet | |
http://www.mc.maricopa.edu/~dschultz/Financial%20Data.pdf
Group 6: The Group from
Your group gets the
odd assignment of examining the first 100 numbers of the Fibonacci Sequence! Ignore the prime factorizations.
| Block | Graham | Atkins |
http://www.mc.maricopa.edu/~dschultz/Wabbits!.pdf
Task 2. Within the appropriate Discussion Folder, discuss your
results. What were the frequencies of the digits? How do your bar graphs look? Anything surprising? Post an initial
message and a response to each of the other two members in your group by……
Task 3. In the folder Digits, the first individual named in your group of three
is to enter their first leading digits data
into the “Class Pool” table found in that folder. Simply add in how
many 1’s, 2’s, 3’s, etc. you counted. Each group needs to have its
contribution added by ……
Task 4. Everyone
is now to create one large bar graph using the pooled first leading digits data from Task 3. Examine this graph
and answer the following:
a. Does the class graph appear similar to your
own individual graph? Explain.
b. What is the probability that a digit chosen
at random from 1 to 9 is a 2? Does this seem to fit the class’s graph results?
In fact, do any of the graph frequencies appear to fit the probability
you just computed?
c. Try to fit a function to the class’s bar
graph using one of the standard regression curve options found on a graphing
calculator (e.g., linear, quadratic, or logistic).
d. Explain why you think the pooled frequencies
may be what they are. I am not looking
for a proof but an argument.
Task 5. Imagine that you deposit $1,000 in a bank and receive 10% interest
compounded annually. The next
year you'll have $1,100, the year after that $1,210, then $1,331, and so on.
Approximately how many years will it take until your account finally has a “2”
as its first leading digit? Approximately how many years would it take to go
from a leading digit of “2” to a “3”? Approximately how many years would it
take to go from a leading digit of “3” to a “4”? Explain how these different
lengths of time might help answer Task 4 part “d”.
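To check your estimates, the yearly balances can be simulated directly. This is only a sketch under the assumptions stated in the task ($1,000 principal, 10% compounded once per year); the function names are made up for illustration. The second function is a continuous-time approximation, not something the task itself specifies.

```python
import math


def first_digit(amount):
    """First leading digit of a balance of at least 1 dollar."""
    return int(str(int(amount))[0])


def years_until_digit(principal, rate, target_digit, max_years=1000):
    """Whole years of compounding until the balance first leads with target_digit."""
    balance = principal
    for year in range(max_years + 1):
        if first_digit(balance) == target_digit:
            return year
        balance *= 1 + rate  # one more year of compound interest
    raise ValueError("target digit not reached within max_years")


def years_with_digit(digit, rate):
    """Continuous estimate of how long the leading digit stays at `digit`.

    The balance must grow by a factor of (digit + 1) / digit, and it grows
    by a factor of (1 + rate) each year.
    """
    return math.log((digit + 1) / digit) / math.log(1 + rate)


for digit in (2, 3, 4):
    print("first year leading with", digit, ":", years_until_digit(1000, 0.10, digit))

for digit in (1, 2, 3):
    print("approx. years spent leading with", digit, ":",
          round(years_with_digit(digit, 0.10), 2))
```

Comparing how long the balance sits at each leading digit is the point of the exercise; the code only automates the arithmetic.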
Task 6. Using a number line from 0 to 1, divide it according to the frequency percents of the pooled first digit data from Task 3. For example, if the digit “1” appeared 20% of the time, designate that on the number line by shading up to 0.20. Continue adding each frequency until the number line from 0 to 1 has been completely filled. A fictitious example is shown below.
[Figure: Leading First Digit Percents]
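The breakpoints on such a number line are just running totals of the digit frequencies. Here is a minimal Python sketch; the function name and the sample tallies are made up for illustration (they are not the class data), with the digit “1” set to 20% to match the example above.

```python
from itertools import accumulate


def number_line_segments(counts):
    """Map nine tallies (digits 1 to 9) to (digit, start, end) pieces of [0, 1]."""
    total = sum(counts)
    # Running totals divided by the grand total give each segment's right end.
    ends = [c / total for c in accumulate(counts)]
    starts = [0.0] + ends[:-1]
    return list(zip(range(1, 10), starts, ends))


# Fictitious tallies out of 100 values, for illustration only.
fictitious = [20, 15, 12, 11, 10, 9, 8, 8, 7]
for digit, start, end in number_line_segments(fictitious):
    print(f"digit {digit}: shade from {start:.2f} to {end:.2f}")
```

Working from the raw tallies rather than rounded percents keeps the last segment ending exactly at 1.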

Task 7. Compare the number line you created in Task 6
to the one shown below. Write some general observations.

Task 8. From the
picture in Task 7 we see that determining P(1), the probability that the leading
digit is a “1”, is the same as computing the probability that our data
number on a log base 10 scale is between log(1) = 0 and log(2). To do this we
compute the integral shown below.
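One way to write that integral, assuming the data are spread uniformly along the log base 10 scale so that the density is 1 on the interval from 0 to 1 (the variable name x and the uniform density are assumptions made here for illustration, and log means log base 10):

```latex
P(1) = \int_{\log 1}^{\log 2} 1 \, dx = \log 2 - \log 1 = \log 2 \approx 0.301
```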

So, we conclude that the leading digit has about a 30% probability of being
a “1”. Compute the probabilities of the other 8 digits using the same
integration technique. Record your results in tabular form for all 9 digits (1
through 9). Next, derive the general model, which was posed much earlier but
formally proven only in 1996, by following the notation and then computing the final
integral:
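A sketch of the setup, assuming the same uniform density on the log base 10 scale and writing d for the leading digit (the symbol d is an assumption chosen here for illustration; the integral is deliberately left for you to evaluate):

```latex
P(d) = \int_{\log d}^{\log (d+1)} 1 \, dx, \qquad d = 1, 2, \ldots, 9
```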

Finally, show that your general formula is algebraically equivalent to
the more commonly used form shown below:
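The form most often quoted for this model, written with the assumed symbol d for the leading digit and log taken base 10:

```latex
P(d) = \log\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \ldots, 9
```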

Task 9. Write a few lines about this modeling
experience, noting the things you found interesting or surprising. Was this new
to you? Could you see junior high students creating bar graphs with some data
and being surprised?
Task 10. There are
several very good and readable articles concerning this modeling phenomenon. I
have purposely not mentioned the names associated with it so that you can make
the discovery yourself. There will be a live “link” posted on ….. at …… Read one of the short articles and submit a half-page summary of it, noting the important features,
in a WebCT email to David by….

Data Analysis Sheet
| First Leading Digit | Tally | % of Total Data Values |
| 1 | | |
| 2 | | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |
| 8 | | |
| 9 | | |

| Second Leading Digit | Tally | % of Total Data Values |
| 0 | | |
| 1 | | |
| 2 | | |
| 3 | | |
| 4 | | |
| 5 | | |
| 6 | | |
| 7 | | |
| 8 | | |
| 9 | | |