Some algorithms

I applied for a job at Amazon a while ago, and failed the online AMCAT test. I can’t remember the exacts of this one given problem, but more or less it’s something similar to this one:

Given 2 lists of numbers and a maximum, find the maximum of sum of a number from list1 + a number from list2 that is smaller or equal to the given maximum, as well as the number of occurrences.

Since I failed and didn’t pass the test suite given by Amazon, I decided to spend some time hacking away and tinkering with algorithms here: https://github.com/ragnarokatz/algorithm. My goal here is to create a more efficient algorithm to solve this problem, because quite possibly one of the reasons I didn’t pass the test suite is that the amount of time required to compute exceeded the amount of time allowed by AMCAT.

Here’s the pure bash version:

import time
import random


start_time = time.time()
random.seed(0)

ROUTES_TO = [random.randint(0, 10000) for n in range(10000)]
ROUTES_BACK = [random.randint(0, 10000) for n in range(10000)]
MAX = 10000

count = 0
current_max = 0

for i in range(len(ROUTES_TO)):
    for j in range(len(ROUTES_BACK)):
        total = ROUTES_TO[i] + ROUTES_BACK[j]
        if total > MAX:
            continue
        if total > current_max:
            current_max = total
            count = 1
        elif total == current_max:
            count += 1

end_time = time.time()

print (f'upper limit set to = {MAX}')
print (f'current maximum possible = {current_max}, number of combinations = {count}')
print( f'time elapsed = {round(end_time - start_time, 2)}')

I’ve set the random seed to be the same in every single case in order to come up with the exact same answers in all the algorithms. The max is set to 10000. The first version is just a double loop through the lists, recording a new maximum whenever encountered. It takes about 25.28s to finish on my computer.

Improved version, sorted from small to large:

import time
import random


start_time = time.time()
random.seed(0)

MAX = 10000
ROUTES_TO = [random.randint(0, 10000) for n in range(10000)]
ROUTES_BACK = [random.randint(0, 10000) for n in range(10000)]

ROUTES_TO.sort()
ROUTES_BACK.sort()

count = 0
current_max = 0


for i in range(len(ROUTES_TO)):
    for j in range(len(ROUTES_BACK)):
        total = ROUTES_TO[i] + ROUTES_BACK[j]
        if total > MAX:
            break
        if total > current_max:
            current_max = total
            count = 1
        elif total == current_max:
            count += 1

end_time = time.time()

print (f'upper limit set to = {MAX}')
print (f'current maximum possible = {current_max}, number of combinations = {count}')
print( f'time elapsed = {round(end_time - start_time, 2)}')

The second version sorts the lists in an ascending fashion, then starts iterating through with the smallest. Whenever a sum is detected to exceed the maximum, it breaks the current iteration. Since if a number that is smaller than the next one already exceeds the maximum, then surely the next number will exceed the maximum as well. This one took 14.56s.

Even better version: sorted from large to small:

import time
import random


start_time = time.time()
random.seed(0)

MAX = 10000
ROUTES_TO = [random.randint(0, 10000) for n in range(10000)]
ROUTES_BACK = [random.randint(0, 10000) for n in range(10000)]

ROUTES_TO.sort(reverse=True)
ROUTES_BACK.sort(reverse=True)

count = 0
current_max = 0


for i in range(len(ROUTES_TO)):
    for j in range(len(ROUTES_BACK)):
        total = ROUTES_TO[i] + ROUTES_BACK[j]
        if total > MAX:
            continue
        if total < current_max:
            break
        elif total == current_max:
            count += 1
        else:
            current_max = total
            count = 1

end_time = time.time()

print (f'upper limit set to = {MAX}')
print (f'current maximum possible = {current_max}, number of combinations = {count}')
print( f'time elapsed = {round(end_time - start_time, 2)}')

This version sorts the lists in descending fashion first. Starting with the biggest sum that is smaller than the given maximum, it will break the current iteration if a smaller sum is detected. If a maximum has already been produced and the next number is smaller, then surely the sum will not exceed the current maximum. It took 8.89s to finish.

This is all I can come up with at the moment off the top of my head. I’m planning on looking into some dynamic programming ideas and see if I can incorporate any of those into the current solution. Here’s a useful link if you are interested: https://www.youtube.com/watch?v=vYquumk4nWw

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Create your website at WordPress.com
Get started
%d bloggers like this: