I Need Assistance With

In [3] : import sqlite3 as db
import pandas as pd
from datetime import datetime
from collections import defaultdict
disk_engine = db. connect ( ‘ file: prob0 . db?mode=ro’, uri=True)
def load_data( ) :
df = pd. read_sql_query ( quot;SELECT * FROM soccer_resultsquot;, disk_engine)
return df
In [4]: # Test: Exercise 0 (exposed)
df = load_data ( )
assert df . shape [ 0] == 22851, quot;Row counts do not match. Try loading the data againquot;
assert df . shape[1] == 9, quot;You don’t have all the columns. Try loading the data againquot;
print ( quot;\n (Passed! ) quot;)
df . head ( )
(Passed! )
Out [ 4 ] :
date home_team away_team home_score away_score tournament
city
country neutral
0 1994-01-02
Barbados
Grenada
0
Friendly Bridgetown Barbados FALSE
1994-01-02
Ghana
Egypt
– N
Friendly
Accra
a FALSE
2 1994-01-05
Mali Burkina Faso
Friendly
Bamako
Mali FALSE
3 1994-01-09
Mauritania
Mali
3
Friendly Nouakchott Mauritania FALSE
1994-01-11
Thailand
Nigeria
Friendly
Bangkok
Thailand FALSE
Each row of this dataframe is a game, which is played between a quot;home teamquot; (column home_team) and an quot;away teamquot; (away_team). The number of goals
scored by each team appears in the home_score and away_score columns, respectively.
Exercise 1 (1 point): Write an SQL query find the ten (10) teams that have the highest average away-scores since the year 2000. Your query should satisfy the
following criteria
. It should return two columns:
. team: The name of the team
ave_goals: The team’s average number of goals in quot;awayquot; games. An quot;away gamequot; is one in which the team’s name appars in away_team and the
game takes place at a quot;non-neutral sitequot; (neutral value equals FALSE).
It should only include teams that have played at least 30 away matches
. It should round the average goals value (ave_goals) to three decimal places.
It should only return the top 10 teams in descending order by average away-goals.
. It should only consider games played since 2000 (including the year 2000).
Store your query string as the variable, query_top10_away, below. The test cell will run this query string against the input dataframe, df, defined above and
return the result in a dataframe named offensive_teams. (See the test cell.)
Note. The following exercises have hidden test cases and you’ll be awarded full points for passing both the exposed and hidden test cases.
In [5]: query_top10_away =
SELECT away_team, ROUND ( avg (away_score) , 3)
FROM data
WHERE neutral is false and strftime( ‘ $H’, date) ; gt;= 2000
GROUP BY away_team HAVING count ( *) gt;= 30
ORDER BY ROUND ( avg ( away_score) , 3)
DESC LIMIT 10;Python Programming