LLM SQL Generation Benchmark Results

We assessed the ability of popular LLMs to generate accurate and efficient SQL from natural language prompts. Using a 200 million record dataset from the GH Archive uploaded to Tinybird, we asked the LLMs to generate SQL based on 50 prompts. The results are shown below and can be compared to a human baseline.

| Rank | Provider | Score | Efficiency | Exactness | LLM gen time (s) | Attempts | Execution time (ms) | Rows read | Data read (MB) |
|---|---|---|---|---|---|---|---|---|---|
| -- | human | -- | -- | -- | -- | -- | 332.6 | 31,006,852 | 759.83 |
| #1 | anthropic | 80.41 | 96.74 | 64.07 | 4.243 | 1.97 | 352.457 | 28,250,540 | 112.14 |
| #2 | google | 78.17 | 98.37 | 57.98 | 19.100 | 1.06 | 443.22 | 42,878,115 | 826.88 |
| #3 | anthropic | 77.38 | 98.67 | 56.08 | 3.149 | 1.10 | 374.224 | 40,099,998 | 824.57 |
| #4 | moonshotai | 77.09 | 98.42 | 55.77 | 4.265 | 1.06 | 589.22 | 49,539,148 | 903.11 |
| #5 | openrouter | 77.00 | 97.22 | 56.78 | 1.401 | 1.02 | 1,202.46 | 72,041,510 | 1,141.62 |
| #6 | openai | 76.69 | 98.55 | 54.83 | 9.886 | 1.14 | 448.84 | 49,432,133 | 844.29 |
| #7 | qwen | 76.55 | 92.49 | 60.61 | 5.172 | 1.17 | 550.043 | 40,419,858 | 895.38 |
| #8 | deepseek | 76.52 | 99.01 | 54.02 | 3.914 | 1.04 | 608.681 | 45,651,463 | 322.46 |
| #9 | qwen | 76.14 | 98.17 | 54.11 | 8.620 | 1.18 | 397.755 | 38,751,330 | 781.80 |
| #10 | anthropic | 75.85 | 99.28 | 52.41 | 3.234 | 1.02 | 388.96 | 37,145,042 | 684.44 |
| #11 | qwen | 75.54 | 99.60 | 51.48 | 4.571 | 1.04 | 457.224 | 46,666,126 | 333.42 |
| #12 | x-ai | 75.51 | 95.21 | 55.81 | 61.602 | 1.00 | 677.06 | 49,360,869 | 1,145.95 |
| #13 | qwen | 75.41 | 99.43 | 51.39 | 2.303 | 1.02 | 679.163 | 53,829,416 | 387.08 |
| #14 | anthropic | 75.40 | 97.16 | 53.65 | 6.342 | 1.04 | 580.51 | 39,294,543 | 936.76 |
| #15 | qwen | 75.30 | 95.65 | 54.95 | 37.553 | 1.06 | 761.347 | 44,676,197 | 795.72 |
| #16 | openai | 74.98 | 99.88 | 50.08 | 2.074 | 1.00 | 421.6 | 52,027,773 | 246.69 |
| #17 | openai | 74.77 | 97.84 | 51.69 | 10.228 | 1.08 | 613.66 | 52,581,751 | 940.75 |
| #18 | openai | 74.73 | 99.62 | 49.84 | 16.292 | 1.04 | 549.54 | 53,315,039 | 303.04 |
| #19 | openai | 74.65 | 98.79 | 50.51 | 2.955 | 1.00 | 442.98 | 41,636,677 | 756.28 |
| #20 | qwen | 74.55 | 99.11 | 50.00 | 1.474 | 1.00 | 421.14 | 38,561,447 | 879.98 |
| #21 | anthropic | 74.25 | 96.08 | 52.41 | 3.915 | 1.02 | 492.708 | 41,642,822 | 913.54 |
| #22 | openai | 74.22 | 95.85 | 52.59 | 76.620 | 1.04 | 746.8 | 52,804,037 | 936.55 |
| #23 | openai | 74.20 | 97.39 | 51.01 | 11.124 | 1.00 | 596.52 | 48,389,329 | 1,097.99 |
| #24 | deepseek | 74.18 | 99.25 | 49.10 | 5.366 | 1.24 | 362.62 | 39,914,537 | 612.03 |
| #25 | anthropic | 74.13 | 98.43 | 49.84 | 3.702 | 1.02 | 585.62 | 43,365,288 | 907.62 |
| #26 | openrouter | 74.10 | 96.81 | 51.40 | 1.362 | 1.02 | 1,358.24 | 67,797,316 | 1,137.85 |
| #27 | x-ai | 74.00 | 97.48 | 50.52 | 7.127 | 1.06 | 651.74 | 55,296,404 | 869.75 |
| #28 | openai | 73.98 | 97.95 | 50.00 | 2.190 | 1.06 | 818.38 | 54,736,481 | 995.56 |
| #29 | qwen | 73.82 | 98.30 | 49.34 | 36.262 | 1.04 | 439.38 | 45,468,824 | 791.67 |
| #30 | openai | 73.75 | 97.24 | 50.26 | 21.133 | 1.04 | 702.64 | 68,364,075 | 1,005.01 |
| #31 | mistralai | 73.72 | 98.64 | 48.81 | 2.088 | 1.04 | 666.02 | 53,051,447 | 878.95 |
| #32 | meta-llama | 73.69 | 99.09 | 48.30 | 3.095 | 1.04 | 410.78 | 40,161,866 | 793.26 |
| #33 | qwen | 73.56 | 95.17 | 51.96 | 17.344 | 1.02 | 720.837 | 54,897,195 | 1,106.61 |
| #34 | qwen | 73.25 | 98.01 | 48.49 | 2.456 | 1.08 | 732.878 | 46,841,414 | 767.00 |
| #35 | mistralai | 73.21 | 99.19 | 47.22 | 0.855 | 1.00 | 775.14 | 42,657,411 | 620.15 |
| #36 | thedrummer | 73.16 | 98.63 | 47.68 | 1.966 | 1.10 | 412.306 | 36,265,794 | 823.24 |
| #37 | google | 73.08 | 97.83 | 48.32 | 20.782 | 1.04 | 579.36 | 38,815,820 | 806.77 |
| #38 | google | 72.98 | 99.87 | 46.09 | 2.126 | 1.02 | 337.4 | 36,295,667 | 262.45 |
| #39 | anthropic | 72.84 | 99.75 | 45.93 | 2.731 | 1.08 | 522.38 | 47,370,988 | 297.58 |
| #40 | x-ai | 72.81 | 95.48 | 50.15 | 10.611 | 1.00 | 762.122 | 50,156,062 | 1,088.51 |
| #41 | qwen | 72.64 | 98.72 | 46.56 | 2.453 | 1.02 | 556.74 | 42,185,121 | 868.81 |
| #42 | qwen | 72.50 | 97.70 | 47.30 | 19.002 | 1.02 | 602.48 | 45,928,106 | 890.50 |
| #43 | meta-llama | 72.44 | 99.92 | 44.96 | 2.048 | 1.04 | 289.875 | 39,101,618 | 134.66 |
| #44 | openai | 72.34 | 99.85 | 44.83 | 2.145 | 1.04 | 690.28 | 54,131,214 | 193.58 |
| #45 | deepseek | 72.12 | 90.83 | 53.41 | 5.875 | 1.11 | 383.682 | 38,010,973 | 813.72 |
| #46 | openai | 71.94 | 95.20 | 48.68 | 3.205 | 1.09 | 533.957 | 41,234,766 | 980.50 |
| #47 | x-ai | 71.70 | 96.94 | 46.46 | 6.570 | 1.02 | 830.98 | 51,488,460 | 1,077.86 |
| #48 | google | 71.68 | 95.52 | 47.83 | 39.798 | 1.10 | 686.857 | 53,855,819 | 893.51 |
| #49 | x-ai | 71.36 | 97.96 | 44.76 | 1.701 | 1.04 | 633.612 | 42,572,577 | 720.40 |
| #50 | google | 70.96 | 99.87 | 42.04 | 1.426 | 1.02 | 350.146 | 44,547,543 | 181.54 |
| #51 | nvidia | 70.87 | 97.15 | 44.59 | 12.717 | 1.33 | 483.347 | 40,823,966 | 813.29 |
| #52 | openai | 70.58 | 96.52 | 44.64 | 25.613 | 1.04 | 643.3 | 61,356,069 | 1,161.59 |
| #53 | mistralai | 70.58 | 99.08 | 42.08 | 1.405 | 1.08 | 420.714 | 44,380,748 | 715.11 |
| #54 | qwen | 70.36 | 99.27 | 41.44 | 2.806 | 1.09 | 308 | 31,184,916 | 374.88 |
| #55 | google | 69.52 | 99.14 | 39.90 | 1.622 | 1.00 | 384.551 | 42,309,547 | 735.32 |
| #56 | mistralai | 69.41 | 97.86 | 40.96 | 12.425 | 1.18 | 522.531 | 39,072,130 | 681.69 |
| #57 | mistralai | 68.35 | 97.12 | 39.58 | 2.412 | 1.09 | 372.696 | 44,597,846 | 757.05 |
| #58 | google | 68.30 | 99.79 | 36.81 | 0.962 | 1.04 | 495.78 | 44,671,754 | 328.20 |
| #59 | openai | 67.86 | 93.60 | 42.12 | 27.104 | 1.08 | 1,015.46 | 45,844,074 | 1,126.98 |
| #60 | mistralai | 67.11 | 99.36 | 34.86 | 0.925 | 1.00 | 385.911 | 40,043,041 | 257.63 |
| #61 | openai | 65.66 | 94.11 | 37.20 | 20.092 | 1.02 | 906.12 | 61,641,565 | 1,386.39 |
| #62 | mistralai | 64.38 | 98.77 | 30.00 | 3.307 | 1.09 | 680.644 | 48,641,279 | 222.69 |
| #63 | openai | 63.88 | 99.83 | 27.93 | 1.538 | 1.06 | 445.694 | 52,428,071 | 239.26 |
| #64 | mistralai | 57.69 | 71.36 | 44.02 | 1.809 | 1.00 | 376.5 | 37,893,118 | 912.60 |
| #65 | meta-llama | 40.59 | 45.63 | 35.56 | 3.501 | 1.21 | 445.242 | 38,658,489 | 992.39 |
| #66 | alibaba | 27.14 | 0.00 | 54.29 | 20.128 | 1.13 | 553.026 | 57,570,886 | 1,153.90 |
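
In each ranked row, the first number after the provider (the overall score) appears to be the plain average of the two numbers that follow it. That is an observation from the published figures, not a documented formula, so the Python sketch below only checks it against a few rows copied from the results. The helper name `mean_matches` and the 0.01 tolerance (to absorb two-decimal rounding) are assumptions made for illustration.

```python
# Rows copied from the results above:
# (rank, overall score, first sub-score, second sub-score).
rows = [
    ("#1", 80.41, 96.74, 64.07),
    ("#2", 78.17, 98.37, 57.98),
    ("#45", 72.12, 90.83, 53.41),
    ("#66", 27.14, 0.00, 54.29),
]


def mean_matches(score, first, second, tol=0.01):
    """Return True if score equals the mean of the two sub-scores within tol.

    The tolerance is an assumption: the published values are rounded to
    two decimals, so the recomputed mean can differ by about 0.005.
    """
    return abs(score - (first + second) / 2) <= tol


for rank, score, first, second in rows:
    recomputed = (first + second) / 2
    print(rank, score, round(recomputed, 3), mean_matches(score, first, second))
```

If the averaging holds for every row, it would explain why a model with a near-perfect first sub-score can still rank low: the second sub-score carries equal weight.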