Results of the public multiformat listening test (July 2014) DRAFT 2


These are the DRAFT summary results of the multiformat listening test.

A ZIP file containing all results for all samples is NOT available for download.

How to interpret the charts: Each chart shows the 6 codecs on the X axis and the rating given (1.0 to 5.0) on the Y axis. The mean rating for each codec is the middle point of its vertical I-shaped line segment, and the segment itself spans the 95% confidence interval of that mean rating, estimated by bootstrap analysis. This analysis is almost identical to the one used in previous listening tests.
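To make the I-shaped segments concrete, here is a minimal Python sketch of a percentile bootstrap confidence interval for a mean rating. It is an illustration only: the ratings are made up, the function name `bootstrap_ci` and its parameters are hypothetical, and the script used for the test may compute its intervals differently.

```python
import random
import statistics

def bootstrap_ci(ratings, n_resamples=10000, level=0.95, seed=0):
    """Return (mean, low, high) using a percentile bootstrap of the mean."""
    rng = random.Random(seed)
    n = len(ratings)
    # Resample the ratings with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(ratings, k=n)) for _ in range(n_resamples)
    )
    # Take the central `level` fraction of the resampled means.
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return statistics.fmean(ratings), means[lo_idx], means[hi_idx]

# Made-up ratings on the 1.0 to 5.0 scale, NOT data from this test.
ratings = [4.0, 3.5, 4.5, 5.0, 3.0, 4.0, 4.2, 3.8, 4.6, 4.1]
mean, low, high = bootstrap_ci(ratings)
print(f"mean {mean:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

When the intervals of two codecs do not overlap, the difference in their mean ratings is unlikely to be an artifact of who happened to take the test. Overlapping intervals do not by themselves prove the codecs are tied.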

Important note: These plots represent group preferences (for the particular group of people who participated in the test). Individual preferences vary somewhat. The best codec for a person depends on their own preferences and the type of music they prefer.


Plot of the complete result (40 samples, 1234 results):

Full plot

Closeup of the interesting results (40 samples, 1234 results):

Zoomed plot

Per-sample results

A page with graphics for each sample individually is here.

Bitrate table

The codecs and settings were calibrated to provide ~96 kbps on a large variety of music, except for MP3 and the anchors.

Codec bitrates

These are the bitrates used by the codecs for the samples in the test:

	Sample	Length	AAC	Opus	Ogg	MP3 V5	FAAC 96	FAAC q30	Name
	--------------------------------------------------------------------------
	1	18.191 	103	82	97	144	98	58	SinceAlways
	2	20.155 	106	111	107	137	98	54	Waiting
	3	11.879 	114	112	124	169	98	55	velvet
	4	10.031 	101	133	101	156	98	60	trumpet
	5	15.957 	101	122	134	155	98	51	girl
	6	29.933 	99	116	111	144	97	54	Can't Wait Until Tonight (Dry Wurlitzer Mix)
	7	23.450 	85	131	79	77	83	41	35_SQAM_glockenspiel_cut
	8	30.000 	108	100	105	128	98	44	Robots_old
									
	9	20.000 	109	103	116	148	98	48	Asleep__4.11-4.31_
	10	20.178 	100	103	89	120	98	54	Greatest_Love_of_All_2min57
	11	20.000 	102	92	96	131	98	55	Hey Tonight
	12	20.000 	100	120	94	131	98	54	Severance__1.31-1.51_
	13	18.924 	109	104	88	108	98	48	Shinsho_pool_3min45_4min4
	14	22.000 	100	90	107	131	98	56	SlavesOfFear
	15	20.000 	109	104	122	146	98	49	The Chastising of Renegade
	16	22.304 	99	124	96	127	98	55	TrosYGareg
									
	17	9.028 	95	96	99	121	98	46	4-Sound-English-male.441
	18	9.707 	104	119	113	154	99	48	9-Have-big-expensive-car.441
	19	7.840 	95	100	104	116	98	47	12-German-male-speech.441
	20	8.566 	99	94	100	129	98	52	15-Good-evening.441
	21	9.656 	105	112	89	133	98	52	21-classic.441
	22	8.720 	101	133	121	176	99	45	24-Greensleeves-Korean-male-speech.441
	23	9.184 	99	98	95	125	98	53	25-This-is-the-end.441
	24	9.966 	105	104	112	128	99	48	27-last-song-drums-and-trampets.441
									
	25	28.505 	106	101	94	120	98	53	bonhemian_rhapsody
	26	14.986 	107	104	122	145	98	53	clapton_44k
	27	12.701 	107	104	102	135	98	48	Coral
	28	20.106 	101	111	85	141	98	47	ExitMusic
	29	5.007 	116	97	143	160	98	54	liberate
	30	29.849 	103	93	102	131	98	55	NewYorkCity
	31	30.002 	99	97	106	134	98	49	sandman
	32	22.092 	109	114	133	138	98	60	take_your_finger_frin_my_head
									
	33	29.932 	106	113	119	137	98	51	Changes
	34	13.742 	111	121	108	126	98	53	Girl_In_The_Fire__Sample_
	35	30.762 	110	106	112	143	98	52	Hotel California
	36	29.234 	103	105	89	123	98	56	Jupiter, the Bringer of Jolity
	37	25.481 	110	104	114	147	98	48	Last_Of_The_Mohicanz__Sample_
	38	24.671 	110	96	100	130	98	48	Only Time
	39	29.789 	99	97	117	135	98	59	Through The Fire And Flames
	40	16.403 	103	103	99	144	98	56	With Love (Outro)
	--------------------------------------------------------------------------
	Mean	18.973	103.7 	106.7 	106.1 	135.5 	97.6 	51.7
	Unit	second	kbps	kbps	kbps	kbps	kbps	kbps
	Sample	Length	AAC	Opus	Ogg	MP3 V5	FAAC 96	FAAC q30	Name
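
As a sanity check on the Mean row, the AAC column above can be averaged directly. This minimal Python sketch assumes the Mean row is a plain unweighted per-sample average (the table does not say so explicitly); the values are copied from the AAC column, and the variable names are illustrative.

```python
from statistics import fmean

# AAC bitrates in kbps, copied from the table above in sample order.
aac_kbps = [
    103, 106, 114, 101, 101, 99, 85, 108,    # samples 1-8
    109, 100, 102, 100, 109, 100, 109, 99,   # samples 9-16
    95, 104, 95, 99, 105, 101, 99, 105,      # samples 17-24
    106, 107, 107, 101, 116, 103, 99, 109,   # samples 25-32
    106, 111, 110, 103, 110, 110, 99, 103,   # samples 33-40
]

mean_aac = fmean(aac_kbps)
print(f"{len(aac_kbps)} samples, mean {mean_aac:.1f} kbps")
# prints "40 samples, mean 103.7 kbps"
```

The result matches the 103.7 kbps in the Mean row, so each sample counts equally regardless of its length.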

    

Bootstrap analysis:

	bootstrap.py v1.0 2011-02-03
	Copyright (C) 2011 Gian-Carlo Pascutto 
	License Affero GPL version 3 or later 

	Reading from: results_AAC_2011.txt
	Read 6 treatments, 280 samples => 15 comparisons
	Means:
	    Nero      CVBR      TVBR       FhG        CT  low_anchor
	   3.698     4.391     4.342     4.253     4.039     1.545

	Unadjusted p-values:
		  CVBR      TVBR      FhG       CT        low_anchor
	Nero      0.000*    0.000*    0.000*    0.000*    0.000*
	CVBR      -         0.128     0.002*    0.000*    0.000*
	TVBR      -         -         0.059     0.000*    0.000*
	FhG       -         -         -         0.000*    0.000*
	CT        -         -         -         -         0.000*

	CVBR is better than Nero (p=0.000)
	TVBR is better than Nero (p=0.000)
	FhG is better than Nero (p=0.000)
	FhG is worse than CVBR (p=0.002)
	CT is better than Nero (p=0.000)
	CT is worse than CVBR (p=0.000)
	CT is worse than TVBR (p=0.000)
	CT is worse than FhG (p=0.000)
	low_anchor is worse than Nero (p=0.000)
	low_anchor is worse than CVBR (p=0.000)
	low_anchor is worse than TVBR (p=0.000)
	low_anchor is worse than FhG (p=0.000)
	low_anchor is worse than CT (p=0.000)

	p-values adjusted for multiple comparison:
		  CVBR      TVBR      FhG       CT        low_anchor
	Nero      0.000*    0.000*    0.000*    0.000*    0.000*
	CVBR      -         0.130     0.005*    0.000*    0.000*
	TVBR      -         -         0.107     0.000*    0.000*
	FhG       -         -         -         0.000*    0.000*
	CT        -         -         -         -         0.000*

	CVBR is better than Nero (p=0.000)
	TVBR is better than Nero (p=0.000)
	FhG is better than Nero (p=0.000)
	FhG is worse than CVBR (p=0.005)
	CT is better than Nero (p=0.000)
	CT is worse than CVBR (p=0.000)
	CT is worse than TVBR (p=0.000)
	CT is worse than FhG (p=0.000)
	low_anchor is worse than Nero (p=0.000)
	low_anchor is worse than CVBR (p=0.000)
	low_anchor is worse than TVBR (p=0.000)
	low_anchor is worse than FhG (p=0.000)
	low_anchor is worse than CT (p=0.000)
    

ANOVA analysis:

	FRIEDMAN version 1.24 (Jan 17, 2002) http://ff123.net/
	Blocked ANOVA analysis

	Number of listeners: 280
	Critical significance:  0.05
	Significance of data: 0.00E+00 (highly significant)
	---------------------------------------------------------------
	ANOVA Table for Randomized Block Designs Using Ratings

	Source of         Degrees     Sum of    Mean
	variation         of Freedom  squares   Square    F      p

	Total             1679        3200.32
	Testers (blocks)   279        1020.15
	Codecs eval'd        5        1666.66  333.33   905.53  0.00E+00
	Error             1395         513.51    0.37
	---------------------------------------------------------------
	Fisher's protected LSD for ANOVA:   0.101

	Means:

	CVBR     TVBR     FhG      CT       Nero     low_anch 
	  4.39     4.34     4.25     4.04     3.70     1.55   

	---------------------------- p-value Matrix ---------------------------

		 TVBR     FhG      CT       Nero     low_anch 
	CVBR     0.333    0.007*   0.000*   0.000*   0.000*   
	TVBR              0.084    0.000*   0.000*   0.000*   
	FhG                        0.000*   0.000*   0.000*   
	CT                                  0.000*   0.000*   
	Nero                                         0.000*   
	-----------------------------------------------------------------------

	CVBR is better than FhG, CT, Nero, low_anchor
	TVBR is better than CT, Nero, low_anchor
	FhG is better than CT, Nero, low_anchor
	CT is better than Nero, low_anchor
	Nero is better than low_anchor
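
The derived figures in the ANOVA table above can be re-computed from its sums of squares and degrees of freedom. The Python sketch below does that arithmetic. One assumption is labeled in the code: the critical value for Fisher's LSD is taken from the normal distribution rather than from the t distribution with 1395 degrees of freedom, which the tool presumably uses. The two are nearly equal at this sample size.

```python
from statistics import NormalDist

# Values copied from the ANOVA table above.
ss_codecs, df_codecs = 1666.66, 5
ss_error, df_error = 513.51, 1395
n_listeners = 280

ms_codecs = ss_codecs / df_codecs   # mean square for codecs
ms_error = ss_error / df_error      # mean square error
f_stat = ms_codecs / ms_error       # F statistic

# ASSUMPTION: normal approximation to the two-sided 5% critical value;
# a t distribution with 1395 degrees of freedom would be the exact choice.
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
lsd = z_crit * (2 * ms_error / n_listeners) ** 0.5

print(f"MS codecs {ms_codecs:.2f}, MS error {ms_error:.2f}")
print(f"F {f_stat:.2f}, LSD {lsd:.3f}")
```

The computed values agree with the table to rounding. Two codecs whose mean ratings differ by more than the LSD (about 0.1) are reported as significantly different, which is consistent with CVBR (4.39) and TVBR (4.34) not being separated.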
    

Post-screening:

Invalid results were discarded according to the following criteria, which were made public at the beginning of the test:

Contact

IgorC: igoruso@gmail.com

Kamedo2: Twitter @kamedo2