Fraudulent Digits and Bogus Numbers

 

            Today’s world is characterized by millions of electronic data transfers, financial submissions, and other numerical sets. It is nothing to swipe a card in an ATM and walk away with hundreds of dollars in cash or to electronically submit the corporate earnings of thousands of companies in a flash. With such speed and efficiency comes the risk of fraud. But how does one distinguish legitimate numbers from bogus ones? In 1993, in my home state of Arizona, James Nelson was accused and found guilty of defrauding the State of nearly 2 million dollars. Nelson was a manager in the office of the Arizona State Treasurer and diverted 23 checks for his own personal use. Here is a listing of those checks:

 

 

            So how was the prosecution able to prove that these checks were fraudulent? Welcome to the mathematical field of Digital Analysis. Digital analysis has become standard practice for CPAs and financial investigators who need to detect fraudulent corporate data and income tax returns, and it was used in validating the solutions to the anticipated Y2K problems. It is now beginning to find its way into other areas of application, such as detecting students who cheat! Anyway, let’s explore some data!

 

 

 

 

 

            To introduce you to an amazing observation that was first made back in 1881, revisited in 1938, yet left unproven until 1996, we will examine some data.

 

Your Written Submission Responsibilities

 

To make sure you have met all the requirements of this project, here is what you will need to submit to David by …… as a Word document attachment.

 

  1. The two digit/frequency tables and their respective bar graphs from task 1.
  2. Write a short paragraph summarizing your group’s comments from task 2.
  3. Create the pooled data bar graph and then answer the accompanying questions from task 4.
  4. From task 5, show the time computations asked for and answer the questions posed.
  5. Create the number line asked for in task 6 using the fictitious example as a guide.
  6. Write a few sentences comparing the number line you created with the one shown in task 7.
  7. Compute all the desired probabilities and integrals from task 8, arriving finally at the accepted model used in the real world.
  8. Answer the questions in task 9.
  9. See task 10 for details.

 

Task 1. Each group member is to examine the data provided for their assigned group, recording the frequency of the first leading digit and of the second leading digit in the tally tables provided at the end of this sheet or in tables of their own design. For example, if the number is 234,670, the first leading digit is “2” while the second leading digit is “3”.

Once the frequencies have been recorded, create a bar graph with the first leading digit values as the horizontal items and the vertical scale as a percent. Repeat with a separate graph using the second leading digits as the horizontal items (Note: zero is now a valid digit).
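
For anyone who would rather check a hand tally by computer, the counting in task 1 can be sketched as follows. This is only an illustrative sketch: the function names `leading_digits` and `digit_percentages` and the sample values are made up for this example, and it assumes every data value is a nonzero number. A single-digit value is treated as having a second leading digit of 0, which is also an assumption.

```python
from collections import Counter


def leading_digits(value):
    """Return (first, second) leading digits of a nonzero number."""
    # Scientific notation puts the first digit before the decimal point and
    # the second digit right after it, e.g. 234670 -> "2.3467000000e+05".
    mantissa = format(abs(value), ".10e")
    return int(mantissa[0]), int(mantissa[2])


def digit_percentages(values, position):
    """Tally digits at position 0 (first) or 1 (second) as percents of the data."""
    counts = Counter(leading_digits(v)[position] for v in values)
    total = len(values)
    return {digit: 100 * count / total for digit, count in sorted(counts.items())}


# Made-up sample values, not taken from any of the group data sets.
sample = [234670, 1100, 1210, 1331, 2900]
print(leading_digits(234670))        # (2, 3)
print(digit_percentages(sample, 0))  # {1: 60.0, 2: 40.0}
```

The percentages returned for position 0 and position 1 are the heights of the bars in the two graphs.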

 

Group 1: The River Runners.

 

Keith

Hsiao

Kaatz

 

Your group gets the pleasure of examining the lengths of America’s and the World’s longest rivers.

 

http://www.mc.maricopa.edu/~dschultz/Rivers.pdf

 

 

Group 2: The People Counters.

 

Savage

Ascheman

Nelson

 

 

Your group gets the pleasure of examining recent population estimates of the World’s countries.

 

http://www.mc.maricopa.edu/~dschultz/Populations&Area.pdf

 

Group 3: The Area Reckoners.

 

Handl

VanDeKop

Mount

 

 

Your group gets the pleasure of examining the land areas of the World’s countries.

 

http://www.mc.maricopa.edu/~dschultz/Populations&Area.pdf

 

 

Group 4: The Stock Brokers.

 

Show me the money! Your group gets the pleasure of examining the trading volume of any 100 companies in either the New York Stock Exchange or the NASDAQ. Pick a starting point in a newspaper’s business section and then record the trading volumes of the next 100 consecutive companies. The day is irrelevant (as long as it’s a trading day!).

 

Briggs

Ducheneaux

McNickle

 

 

Group 5: The Mystery Machine Group.

 

Your group gets the pleasure of examining the financial statements of an unnamed HMO. Choose a starting point and record the next 100 values.

 

Fattic

VanVleet

Hammond

 

 

http://www.mc.maricopa.edu/~dschultz/Financial%20Data.pdf

 

 

Group 6: The Group from Pisa.

 

Your group gets the odd assignment of examining the first 100 numbers of the Fibonacci Sequence! Ignore the prime factorizations.

 

 

Block

Graham

Atkins

 

http://www.mc.maricopa.edu/~dschultz/Wabbits!.pdf

 

 

Task 2. Within the appropriate Discussion Folder, discuss your results. What were the frequencies of the digits? How do your bar graphs look? Anything surprising? Post an initial message and a response to each of the other two members in your group by……

 

Task 3. In the folder Digits, the first person named in your group of three is to enter the group’s first leading digits data into the “Class Pool” table, which is found in that folder. Simply add in how many 1’s, 2’s, 3’s, etc. you counted. Each group needs to have its contribution added in by ……

 

Task 4. Everyone is now to create one large bar graph using the pooled first leading digits data from task 3. Examine this graph and answer the following:

a.      Does the class graph appear similar to your own individual graph? Explain.

b.      What is the probability that a digit chosen at random from 1 to 9 is a 2? Does this seem to fit the class’s graph results? In fact, do any of the graph frequencies appear to fit the probability you just computed?

c.      Try to fit a function to the class’s bar graph using one of the standard regression curve options found on a graphing calculator (e.g., linear, quadratic, logistic).

d.      Explain why you think the pooled frequencies may be what they are. I am not looking for a proof but an argument.

 

Task 5. Imagine that you deposit $1,000 in a bank and receive 10% compound interest per year. The next year you'll have $1,100, the year after that $1,210, then $1,331, and so on. Approximately how many years will it take until your account finally has a “2” as its first leading digit? Approximately how many years would it take to go from a leading digit of “2” to a “3”? Approximately how many years would it take to go from a leading digit of “3” to a “4”? Explain how the various time factors might help answer task 4, part d.
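
One possible way to check the time computations is sketched below. It is an assumption-laden sketch rather than the required method: the function name `years_to_grow` is made up for this example, and time is treated as continuous, so the results are fractional years rather than whole years of posted interest.

```python
import math


def years_to_grow(start, target, rate=0.10):
    """Years for a balance to grow from start to target at the given yearly rate."""
    # Solve start * (1 + rate)**t = target for t using logarithms.
    return math.log(target / start) / math.log(1 + rate)


# Time spent crossing each thousand-dollar band in the task 5 example.
for low in (1000, 2000, 3000):
    high = low + 1000
    print(f"${low} to ${high}: about {years_to_grow(low, high):.1f} years")
```

Comparing the three printed times with one another is the part that bears on task 4, part d.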

 

Task 6. Using a number line from 0 to 1, divide it according to the frequency percentages of the pooled first digit data from task 3. For example, if the digit “1” appeared 20% of the time, designate that on the number line by shading up to 0.20. Continue adding each frequency until the number line from 0 to 1 has been completed. A fictitious example is shown below.
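
The division points on the number line are just running totals of the percentages. A minimal sketch, using fictitious percentages invented for this example (they are not the class’s pooled data) and a made-up function name `number_line_segments`:

```python
from itertools import accumulate


def number_line_segments(percents):
    """Return (digit, start, end) for each digit's stretch of the 0-to-1 line."""
    # Running totals of the percentages give the right-hand end of each stretch.
    ends = [total / 100 for total in accumulate(percents)]
    starts = [0.0] + ends[:-1]
    return list(zip(range(1, len(percents) + 1), starts, ends))


# Fictitious first-digit percentages for digits 1 through 9 (they sum to 100).
fictitious = [20, 15, 12, 11, 10, 9, 8, 8, 7]
for digit, start, end in number_line_segments(fictitious):
    print(f"digit {digit}: shade from {start:.2f} to {end:.2f}")
```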

 

Leading First Digit Percents

 

Task 7. Compare the number line you created in task 6 to the one shown below. Write some general observations.

 

 

 

Task 8. From the picture in task 7 we see that determining P(1), the probability that the leading digit is a “1”, is the same as computing the probability that our data number on a log base 10 scale is between log(1) = 0 and log(2). To do this we compute the integral shown below.
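
Assuming the data values are spread uniformly along the log base 10 scale, as the picture in task 7 suggests, the integral for P(1) can be written as follows (here log means log base 10, and the integrand 1 is the uniform density assumed on that scale):

```latex
P(1) = \int_{\log 1}^{\log 2} 1 \, dx = \log 2 - \log 1 = \log 2 \approx 0.301
```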

 

 

So, we conclude that the leading digit has about a 30% probability of being a “1”. Compute the probabilities of the other 8 digits using the same integration technique. Record your results in tabular form for all 9 digits (1 – 9). Next, by following the notation and then computing the final integral, derive the general model, which was posed much earlier but formally proven only in 1996:

 

 

Finally, show that your general formula is algebraically equivalent to the more commonly used form shown below:

 

 

Task 9. Write a few lines about this modeling experience, noting the things you found interesting or surprising. Was this new to you? Could you see junior high students creating bar graphs with some data and being surprised?

 

Task 10. There are several very good and readable articles concerning this modeling phenomenon. I have purposely not mentioned the names associated with it so that you can make the discovery yourself. There will be a live “link” posted on ….. at …… Read one of the short articles and submit a half-page summary of it, noting the important features, in a WebCT email to David by….

 

 

 

 

 

 

 

 

 

 

Data Analysis Sheet

 

 

First Leading Digit  | Tally | % of Total Data Values
1                    |       |
2                    |       |
3                    |       |
4                    |       |
5                    |       |
6                    |       |
7                    |       |
8                    |       |
9                    |       |

 

 

 

Second Leading Digit | Tally | % of Total Data Values
0                    |       |
1                    |       |
2                    |       |
3                    |       |
4                    |       |
5                    |       |
6                    |       |
7                    |       |
8                    |       |
9                    |       |