一些来自LeetCode的Shell和Sql问题

Shell和Sql都是日常工作中经常使用的工具,发现LeetCode还有一些ShellSql问题,由于项目中写Shell和Sql不是很多,练习的时候有些题目还稍微遇到了一点困难,记录如下。

Shell

192. Word Frequency

难度:Medium
题目大意:

  • 写一个bash脚本计算word.txt文件中每个单词出现的频率
  • 只存在小写字母单词表,单词只使用一个或是多个空格分割
  • 输出频率时候按照降序排列,由于保证单词的频率相互不重复,不用考虑再使用字典序做二次排序

示例输入:

1
2
the day is sunny the the
the sunny is is

示例输出:

1
2
3
4
the 4
is 3
sunny 2
day 1

使用sort/uniq计算频率

1
cat words.txt | tr -s ' ' '\n' | sort | uniq -c | sort -r | awk '{ print $2, $1  }'
  • tr -s : 将空格替换为换行,其中-s --squeeze-repeats指明将连续的空格视为一个空格
  • sort : 按照字典序排序,为统计频率做准备,因为uniq只能统计连续出现的单词
  • uniq -c : 将排序后单词序列中连续出现的重复单词去重,-c将重复次数输出
  • sort -r : 按照频率降序排列结果
  • awk : 将输出结果的第一栏和第二栏调换输出

仅仅使用awk和sort

awk功能非常强大。

1
awk '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' words.txt | sort -k2 -nr

  • awk的对于每一行生效的指令中,对于每一行的每一栏的单词,将他们统计到map中
  • awk最后指令,将单词与频率输出
  • sort -k2 -nr : 使用第二栏的数据,进行降序排序

194. Transpose File

难度:Medium
题目大意:

  • 写一个bash脚本转置file.txt给出单词的矩阵
  • 单词只使用一个空格分割

示例输入:

1
2
3
name age
alice 21
ryan 30

示例输出:

1
2
name alice ryan
age 21 30

awk

还是继续awk大法好,awk其实类似于一个文件处理的编程框架。
awk和sed使用好,比直接写shell和使用各种命令进行管道叠加要效率高并且可读性高。

1
2
3
4
5
6
7
8
9
10
11
12
13
awk '{
row=NR;
col=NF;
for(i=1;i<=NF;i++) a[NR, i]=$i;
}
END {
for(i=1;i<=col;i++) {
printf("%s", a[1, i]);
for(j=2;j<=row;j++)
printf(" %s", a[j, i]);
print "";
}
}' file.txt

直接将单词矩阵存下来,然后转置输出。

Sql

176. Second Highest Salary

难度:Easy
题目大意:

  • Employee表中选出第二高的薪水记录
  • 如果不存在第二高的记录,返回null
1
2
3
4
5
6
7
+----+--------+
| Id | Salary |
+----+--------+

| 1 | 100 |
| 2 | 200 |
| 3 | 300 |
+----+--------+

1
2
3
select ifnull(
(select distinct Salary from Employee order by Salary desc limit 1,1)
, null) as SecondHighestSalary

使用ifnull函数进行记录是否为空的判断。

178. Rank Scores

难度:Medium
题目大意:

  • 用Sql来为分数分级,对于相同的分数,分为同一级,下一级和前一级别连续编号。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
+----+-------+
| Id | Score |
+----+-------+

| 1 | 3.50 |
| 2 | 3.65 |
| 3 | 4.00 |
| 4 | 3.85 |
| 5 | 4.00 |
| 6 | 3.65 |
+----+-------+


输出:
+-------+------+

| Score | Rank |
+-------+------+

| 4.00 | 1 |
| 4.00 | 1 |
| 3.85 | 2 |
| 3.65 | 3 |
| 3.65 | 3 |
| 3.50 | 4 |
+-------+------+

1
2
3
4
5
6
select Score,
( @rank := @rank + ( @prev <> (@prev := Score) ) ) as Rank
from
(
select Score from Scores order by Score desc
) A, (select @rank := 0, @prev := -1) Init

参考他人解法,使用零时表中的两个变量,分别记录分级编号和上一个分数,方便判断是否分数相同。

180. Consecutive Numbers

难度:Medium
题目大意:

  • 用Sql选出连续出现三次即以上的数字
1
2
3
4
5
6
7
8
9
10
11
+----+-----+
| Id | Num |
+----+-----+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
| 6 | 2 |
| 7 | 2 |
+----+-----+
1
2
3
4
select distinct Num as ConsecutiveNums from (
select Id, Num, ( @rank := if(@prev = (@prev := NUM), @rank+1, 1) ) as Rank
from Logs, (select @rank := 1, @prev := -1) Init
) as temp where Rank = 3;

参考178的思路,将每条记录按照是否连续进行编号,再把所有有三次连续出现的不重复数字选出。

184. Department Highest Salary

难度:Medium
题目大意:

  • 挑选出每个部门最高薪水的记录
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28

Employee:
+----+-------+--------+--------------+

| Id | Name | Salary | DepartmentId |
+----+-------+--------+--------------+

| 1 | Joe | 70000 | 1 |
| 2 | Henry | 80000 | 2 |
| 3 | Sam | 60000 | 2 |
| 4 | Max | 90000 | 1 |
| 5 | Janet | 69000 | 1 |
| 6 | Randy | 85000 | 1 |
+----+-------+--------+--------------+


Department:
+----+----------+

| Id | Name |
+----+----------+

| 1 | IT |
| 2 | Sales |
+----+----------+


输出:
+------------+----------+--------+

| Department | Employee | Salary |
+------------+----------+--------+

| IT | Max | 90000 |
| Sales | Henry | 80000 |
+------------+----------+--------+

解法1:900ms

1
2
3
select d.name as Department, e.Name as Employee, Salary
from Employee e inner join Department d on e.DepartmentId = d.Id
where Salary = (select max(Salary) from Employee where DepartmentId = d.Id)

比较暴力,有个Employee记录都有一个冗余的子查询。

解法2:600ms

1
2
3
4
5
6
7
8
select t3.Name as Department, t2.Name as Employee, t1.Salary as Salary
from
(
select max(Salary) as Salary, DepartmentId
from Employee
group by DepartmentId
) t1, Employee t2, Department t3
where t1.Salary = t2.Salary and t1.DepartmentId = t2.DepartmentId and t1.DepartmentId = t3.Id

先把每个部门的最高薪水一起聚合算出,再全连接表并筛选出部门和薪水都匹配的员工记录。

185. Department Top Three Salaries

难度:Hard
题目大意:

  • 在上一题基础上,挑选出每个部门前三高薪水的记录
  • 如果少于三条记录,则只显示存在的记录
1
2
3
4
5
6
7
8
9
select t1.Name as Department, t2.Name as Employee, t2.Salary as Salary
from(
select Name, ifnull((
select distinct Salary from Employee e
where e.DepartmentId = d.Id
order by Salary desc limit 2,1)
, 0) as Salary, Id as DepartmentId from Department d
) t1, Employee t2
where t1.Salary <= t2.Salary and t1.DepartmentId = t2.DepartmentId

和上一题思路类似,先选出第三高的薪水,再全连接筛选。