YH-925GS and Linux/Ubuntu/Debian

Conclusion: The firmware has no .ogg Vorbis support, only .mp3/.wma/.aa, regardless of the published specifications. The FM radio tuner chip is physically missing on the .US and .CA models. The player shows up as a 20GB USB Mass Storage device, but simply dragging audio files onto it does not work: a special utility is needed to regenerate the database of song names. The internal hardware uses the same chip as the iPod mini, iPod Photo and iRiver H10/H320, made by PortalPlayer.

A friend of mine got a Samsung YH-925 mp3 player. The main reason for getting that model was that it claims to be able to play OGG Vorbis and Audible (some weird proprietary Audio-book format) files.

It has a colour 160x100 screen, Up/Down/Left/Right buttons as well as play, stop and the like. Inside there is a 20GB hard disk. I think it also comes in other sizes of 10GB and 5GB with model names YH-920 and something else. It claims to be able to act as a USB host to another USB Mass Storage device such as a pendrive or Digital Camera for copying off and downloading files.

Plugging it into the first Debian box made the USB stack hang and there were feelings that this was probably an expensive Ebay-paperweight. Plugging it into another Ubuntu laptop had it magically pop up on the desktop as an 18.7GB USB Disk. Groovy.

You can drag files onto the gadget just like an external hard-disk drive, but this won't actually help the device to play them. It won't see the audio files unless they are listed in its special database inside the System directory.

The database file is named System/DATA/PP5000.dat with the index files being called System/DATA/PP5000_????.idx and the list of headers is PP5000.hdr

$ hexdump -C System/DATA/PP5000.hdr | head -6
00000000  00 00 00 00 00 00 00 00  53 00 79 00 73 00 74 00  |........S.y.s.t.|
00000010  65 00 6d 00 5c 00 44 00  41 00 54 00 41 00 5c 00  |e.m.\.D.A.T.A.\.|
00000020  50 00 50 00 35 00 30 00  30 00 30 00 2e 00 64 00  |P.P.5.0.0.0...d.|
00000030  61 00 74 00 00 00 00 00  00 00 00 00 00 00 00 00  |a.t.............|

This file contains UCS-2 formatted data (MS Windows stylie Unicode limited to 16 bits per character). For some reason, decoding as UCS-2 didn't work, so let's use UTF-16 and force the byte order (endianness):

$ recode UTF-16LE..UTF-8 < System/DATA/PP5000.hdr | strings
System\DATA\PP5000.dat
System\DATA\PP5000.hdr
System\DATA\PP5000_@DEV.idx
System\DATA\PP5000_FPTH.idx
System\DATA\PP5000_FNAM.idx
System\DATA\PP5000_FRMT.idx
System\DATA\PP5000_TPE1.idx
System\DATA\PP5000_TALB.idx
System\DATA\PP5000_TCON.idx
System\DATA\PP5000_TIT2.idx
(LPTX\
(HLPTX

Okay, so from that file we can see all the other parts of the database that the software in the player is expecting to find.
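
The same extraction can be done without recode and strings. This is only a minimal Python sketch: it assumes the file is little-endian UTF-16 throughout and that only runs of printable ASCII are interesting. The function name and the four-character minimum are my own choices, nothing to do with the player.

```python
import re


def utf16le_strings(data, min_len=4):
    """Return runs of printable ASCII stored as UTF-16LE code units.

    Each character is matched as its ASCII byte followed by 0x00, which
    is how the path names appear in the hexdump of the .hdr file.
    """
    pattern = re.compile(b"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        for s in utf16le_strings(f.read()):
            print(s)
```

Run as a script it takes the file to scan as its only argument, for example System/DATA/PP5000.hdr.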

$ cat System/DATA/PP5000.dat | recode UTF-16LE..UTF-8 | strings | head -6
System\MUSIC\
Yepp-1 groove.mp3
Samsung Electronics Co., Ltd.
Yeppie
Funk
Yepp groove

Looks like directory Path, Filename, Copyright, Artist, Genre and Style at a guess. Searching around for a bit for 'pp5000.dat' I must have mistyped and eventually came across a site detailing the file-format of the Philips HDD100 and Philips HDD120 Audio Jukeboxes. Bingo, identical with the files called 'db5000' instead of 'pp'. So what does 'PP' stand for?

$ strings FW_YH925.mi4  | head -7
PPOS
portalplayer
PP5020AF-05.11-SM05-02.13-GS01-01.00-DT
2004.11.22
(Build 38)
Digital Media Platform
Copyright(c) 1999 - 2003 PortalPlayer, Inc.  All rights reserved.

PP is therefore PortalPlayer, who make various all-in-one System-on-Chip (SoC) designs for portable media players. Wikipedia tells us that the same chip (PP5020) is also behind lots of other media players: iPod mini, iPod Photo (remember the colour screen) and iRiver H10. The '5000' part comes from the series number, so let's log in to Wikipedia and add the Samsung YH-925 and Philips HDD100 now that we know about them.

To identify this device a bit more specifically, its USB ID is 04e8:5024

$ lsusb -v ...
P:  Vendor=04e8 ProdID=5024 Rev= 0.01
S:  Manufacturer=Samsung
S:  Product=Digital Audio Player

The detour to Sulu. Didn't help!

Samsung had a few previous products called the 'YEPP'. So a bit of Googling brought up a program called Sulu, which stands for 'Samsung Uproar Linux Utility'; this seems to be based around Microsoft's MTP (Media Transfer Protocol) for talking to media players that don't want to show up as a hard-disk. I tried faffing around and patching it to add the USB ID and input/output endpoints grabbed from lsusb -v. No success; the YH-925 really is a nice simple USB player that still needs its own database filling in every time music is added, just like the iPod and several others.

Now to start dreaming about iPodLinux and PodZilla... {dreaming}. However, see the comment about finding the bootloader ROM image below!

Playlists

The Playlists seem to be other ways of grouping a load of songs for playing one after another. They live in System/Params/Playlist Name.plp and are nice and easy to start processing. Again, these are stored in UCS-2 form:

cat 'System/Parms/Now Playing.plp' | recode UTF-16LE..UTF-8 | cat -v
PLP PLAYLIST^M
VERSION 1.20^M
^M
HDD, System\MUSIC\Yepp_2_funk.mp3^M
HDD, System\MUSIC\Yepp-1 groove.mp3^M

That is a header followed by a blank line, then a series of lines each made up of 'HDD, ' and a filename, all terminated MS Windows style with carriage return + newline (\r\n).
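
A playlist in that shape is easy to read and write from Python. The following is a rough sketch based only on the one file shown above; the function names are mine, and it assumes there is no byte-order mark and that every entry uses the same 'device, path' form.

```python
def parse_plp(raw):
    """Return (device, path) pairs from the bytes of a .plp playlist."""
    text = raw.decode("utf-16-le").lstrip("\ufeff")
    lines = text.split("\r\n")
    if not lines or lines[0] != "PLP PLAYLIST":
        raise ValueError("not a PLP playlist")
    entries = []
    for line in lines[2:]:          # skip the two header lines
        if not line.strip():
            continue                # blank separator or trailing line
        device, _, path = line.partition(", ")
        entries.append((device, path))
    return entries


def build_plp(paths, device="HDD", version="1.20"):
    """Encode a list of backslash-separated paths as .plp bytes."""
    lines = ["PLP PLAYLIST", "VERSION " + version, ""]
    lines += ["%s, %s" % (device, p) for p in paths]
    return ("\r\n".join(lines) + "\r\n").encode("utf-16-le")
```

Whether the player accepts a file produced by build_plp is untested; it merely round-trips through parse_plp.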

A suitable workflow for a commandline operation to put music on the device is just to load it on by hand and then scrape all the MP3/OGG files on the device for the name and ID3 and fill the database. A suitable workflow for a graphical client might be to present it as a special GNOME/kioslave window and allow files to be dragged from and to, displaying the various columns and updating the database on the go rather than starting from scratch and overwriting the database each time.
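
The scraping half of the commandline workflow could look something like this. It is a sketch only: it reads the old fixed-size ID3v1 tag from the last 128 bytes of each file and ignores ID3v2 entirely, it only looks at .mp3 files, and the function names are made up for the example. Filling the database from the result is not shown.

```python
import os


def read_id3v1(path):
    """Return a dict of ID3v1 fields, or None if the file has no such tag."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 128:
            return None
        f.seek(-128, os.SEEK_END)
        tag = f.read(128)
    if tag[:3] != b"TAG":
        return None

    def text(chunk):
        # Fields are fixed width, padded with NULs or spaces.
        return chunk.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {"title": text(tag[3:33]),
            "artist": text(tag[33:63]),
            "album": text(tag[63:93]),
            "year": text(tag[93:97]),
            "genre": tag[127]}      # numeric genre code


def scan_music(root):
    """Yield (relative path, tags) for every .mp3 found below root."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.lower().endswith(".mp3"):
                full = os.path.join(dirpath, name)
                yield os.path.relpath(full, root), read_id3v1(full)
```

Pointing scan_music at the mounted System/MUSIC directory would give the path and tag information that the database wants for each track.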

But, before we start playing around and breaking things, we need to check that it works with the software under MS Windows so that if it doesn't play .Oggs (the original purpose) it can be sold to somebody else. So, where to find an MS Windows XP machine?...

Windows XP

The device shows up under Windows XP as a Mass Storage Device. Microsoft's media player knows how to upload/download files to the player; I presume this is using their "Media Transfer Protocol" (MTP) system, so the player itself may be updating the database, or it may be the Windows driver twiddling bits. The Microsoft Media Player didn't want to add OGG files to the device since it doesn't know what they are. I didn't install the Napster client, but this is the likely next step as it is an MTP compatible program and, since it was supplied by Samsung with the device, it should know about Audible and OGG files.

The Samsung Media Studio program seems only to be concerned with photo-editing and transferring tiny thumbnail piccys to/from.

EasyH10

I can't make EasyH10 grok the databases; it wants a ''Model Template'' included in the top-level directory of the device. This file starts with MDEL and seems to have something that looks like the '.hdr' header file appended. This would make sense, as the header file contains all the necessary information to generate index files and contains the field descriptions. Despite having the source, I can't work out what is going on since EasyH10 doesn't appear to read the 'MDEL' part, only to write it!

EasyH10 is already somewhat well-designed as it allows templates to be loaded for about a dozen different varieties of iRiver H10 size and firmware combinations.

create-index.py

There is a blog entry about a python utility for the HDD100 which turns out to be very similar; go and fetch:

wget http://kvota.net/hacks/philips-hdd100/{create-index,ID3,mp3}.py

I've had the most success hacking with this. It is written in Python so great for trying things out; but this was originally designed to produce the database files for the Philips player and is purely write-only; it has no idea how to parse the files and instead just uses various hard-coded pointers. (Which are different for the YH-925GS).

By writing something that actually understands the format of the '.hdr' file it should be possible to have something that generically works with most player types, regardless of whether they add or remove fields.

EasyH10 has a fairly full specification of their reading of the H10 database.

Difference

The changes between the Philips HDD100 and YH-925 format so far are:

| Which  | Filename        | Use             |
| HDD100 | db5000.hdr      | Database schema |
| YH-925 | PP5000.hdr      | Database schema |
| HDD100 | db5000.dat      | Data            |
| YH-925 | PP5000.dat      | Data            |
| HDD100 | DB5000_????.IDX | Index           |
| YH-925 | PP5000_????.idx | Index           |

| Which  | Index | Use         |
|        | @DEV  | hidden      |
|        | FNAM  | name        |
|        | FPTH  | path        |
| YH-925 | FRMT  | format      |
|        | TALB  | album       |
|        | TCON  | genre       |
|        | TIT2  | title       |
|        | TPE1  | artist      |
| HDD100 | TRCK  | tracknumber |
| HDD100 | XSRC  | source      |
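
For a tool that is meant to cope with both players, the differences listed above boil down to a small lookup table. A sketch, with the dictionary layout and function name being my own invention, and with index rows that carry no player name taken to mean "both players" (an assumption):

```python
# Naming conventions per player, taken from the comparison above.
PLAYERS = {
    "HDD100": {"hdr": "db5000.hdr", "dat": "db5000.dat",
               "idx": "DB5000_%s.IDX"},
    "YH-925": {"hdr": "PP5000.hdr", "dat": "PP5000.dat",
               "idx": "PP5000_%s.idx"},
}

# Index tags assumed common to both players, plus the ones specific to each.
COMMON_TAGS = ["@DEV", "FNAM", "FPTH", "TALB", "TCON", "TIT2", "TPE1"]
EXTRA_TAGS = {"HDD100": ["TRCK", "XSRC"], "YH-925": ["FRMT"]}


def index_files(player):
    """List the index file names expected for the given player."""
    tags = sorted(COMMON_TAGS + EXTRA_TAGS[player])
    return [PLAYERS[player]["idx"] % tag for tag in tags]
```

For the YH-925 this yields the same eight index names that the '.hdr' file lists, though a parser that reads the names out of the '.hdr' itself would not need the table at all.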

Olympus m:robe

George Deka emailed to say that his m:robe (he didn't say the model) uses the same storage format and that there are some scripts up on www.mrobe.org—but from a quick glance they look fairly well hidden and require registration and passwords!

Bring on the code

Let's knock up some Python code to have an ogle at the '.hdr' file:

# 'sdb' is a base class defined elsewhere in the script (not shown here).
# Each tuple in 'fields' is (name, type, size).

# One column descriptor; the '.hdr' file holds one of these per field.
class PPOS_hdr_entry(sdb):
    fields = [('id', int, 4),
              ('type', int, 4),
              ('length', int, 4),
              ('foo5', int, 4),
              ('foo6', int, 4),
              ('indexed', int, 4),
              ('foo7', int, 4),
              ('foo8', int, 4),
              ('idx_filename', unicode, 256)]
    allowed = {'field_type': {1: unicode,
                              2: int}}

# The fixed header at the start of the '.hdr' file; 'fooN' are unknowns.
class PPOS_hdr_header(sdb):
    fields = [('id',   int, 4),
              ('type', int, 4),
              ('datafile', unicode, 256),
              ('foo4', int, 4),
              ('headerfile', unicode, 256),
              ('foo6', int, 4),
              ('rows', int, 4),
              ('inactive', int, 4),
              ('columns', int, 4) ]

| id | type | datafile               | foo4 | headerfile             | foo6 | rows | inactive | columns |
|  0 |    0 | System\DATA\PP5000.dat |    0 | System\DATA\PP5000.hdr | 1064 |    4 |        0 |      18 |

| id    | type | length | foo5 | foo6 | indexed | foo7 | foo8 | idx_filename                |
| 61441 |    2 |      4 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_@DEV.idx |
| 61442 |    1 |    128 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_FPTH.idx |
| 61443 |    1 |    128 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_FNAM.idx |
| 61450 |    2 |      4 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_FRMT.idx |
| 61445 |    2 |      4 |    0 |    0 |       0 |    0 |    0 |                             |
| 61446 |    2 |      4 |    0 |    0 |       0 |    0 |    0 |                             |
| 61447 |    2 |      4 |    0 |    0 |       0 |    0 |    0 |                             |
|    60 |    1 |     40 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_TPE1.idx |
|    28 |    1 |     40 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_TALB.idx |
|    31 |    1 |     20 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_TCON.idx |
|    46 |    1 |     40 |    0 |    0 |       1 |    0 |    0 | System\DATA\PP5000_TIT2.idx |
|    67 |    2 |      4 |    0 |    0 |       0 |    0 |    0 |                             |
|    78 |    2 |      4 |    0 |    0 |       0 |    0 |    0 |                             |
| 61449 |    2 |      4 |    0 |    0 |       0 |    0 |    0 |                             |
| 57344 |    2 |      4 |    0 |    0 |       0 |    0 |    0 |                             |
| 57345 |    1 |     40 |    0 |    0 |       0 |    0 |    0 |                             |
|   131 |    1 |     10 |    0 |    0 |       0 |    0 |    0 |                             |
|   132 |    1 |     64 |    0 |    0 |       0 |    0 |    0 |                             |
|     0 |    0 |      0 |    0 |    0 |       0 |    0 |    0 |                             |
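
The 'sdb' base class is not shown above, so here is a guess at how a field list like that can drive the decoding. It is a Python 3 sketch (hence 'str' where the original uses 'unicode') resting on two loudly-flagged assumptions: integers are little-endian, and the string size of 256 counts 16-bit characters rather than bytes. If it turns out to be a byte count, drop the factor of two.

```python
def read_record(buf, offset, fields):
    """Decode one record described by (name, type, size) tuples.

    ASSUMPTION: integers are little-endian and 'size' bytes wide.
    ASSUMPTION: strings are fixed-width UTF-16LE, 'size' 16-bit units wide.
    Returns the decoded dict and the offset just past the record.
    """
    record = {}
    for name, kind, size in fields:
        if kind is int:
            record[name] = int.from_bytes(buf[offset:offset + size], "little")
            offset += size
        else:
            raw = buf[offset:offset + 2 * size]
            text = raw.decode("utf-16-le", errors="replace")
            record[name] = text.split("\x00", 1)[0]   # cut at first NUL
            offset += 2 * size
    return record, offset


# Column descriptor layout, copied from PPOS_hdr_entry.
ENTRY_FIELDS = [("id", int, 4), ("type", int, 4), ("length", int, 4),
                ("foo5", int, 4), ("foo6", int, 4), ("indexed", int, 4),
                ("foo7", int, 4), ("foo8", int, 4),
                ("idx_filename", str, 256)]
```

Calling read_record repeatedly, feeding the returned offset back in, walks through consecutive descriptors; the 'columns' value from the header says how many to expect.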

More fiddling and from that information we can get it to automatically generate a schema for the '.dat' file; this should enable it to work on all players using a similar format, as it doesn't have to know the exact details of each one and can use the meta-data from .hdr to calculate them. A major awkwardness turned out to be the variable-length strings. Those have to be read until you hit an aligned Unicode null terminator (two null bytes in a row, starting on an even offset within the string).
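
The alignment matters because an ordinary character can produce two adjacent zero bytes that straddle a character boundary. A small sketch of the reading loop, assuming little-endian UTF-16; the function name is mine:

```python
def read_utf16z(buf, offset):
    """Read a NUL-terminated UTF-16LE string starting at offset.

    Steps two bytes at a time so that only an aligned 00 00 pair ends the
    string. Returns the text and the offset just past the terminator.
    """
    end = offset
    while end + 1 < len(buf) and buf[end:end + 2] != b"\x00\x00":
        end += 2
    return buf[offset:end].decode("utf-16-le"), end + 2
```

For example, 'a' followed by U+0100 is stored as 61 00 00 01, which contains 00 00 in the middle; searching for the first pair of zero bytes anywhere would cut that string short, whereas stepping in twos does not.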

The really interesting bit is that there are no fields that change to mark whether it is an MP3 file or the proprietary WMA. The 'Format' field which I was expecting to have this is empty, although there is something mysterious in the 14th field. A good test now would be just to cat an Ogg Vorbis file into one of the existing files on the device and confirm whether it's possible to play it.

No such luck. Catting the Vorbis file into both a '.wma' and a '.mp3' just causes the player to skip and refuse to play them. Catting the existing .wma's and .ogg's between each other doesn't work either... I think there is a bigger issue here and presumably a checksum or length hiding somewhere. There is a length field in the database above. The following seems to lead in that direction: it changes the length of the track and causes it to stop working.

echo 'hello' >> "Beethoven's Symphony No. 9 (Scherzo).wma"  # causes file to be skipped

However, doing the following also causes the track to be skipped, even though it doesn't change the length:

dd bs=1 count=613638 if=~/Desktop/Amarantine.ogg of="Beethoven's Symphony No. 9 (Scherzo).wma"
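
To rule out an accidental change of length in experiments like this one, the overwrite can be done in a way that cannot alter the file size. A sketch with a made-up function name; it behaves roughly like dd with conv=notrunc and a count capped at the size of the target:

```python
import os


def overwrite_in_place(target, source):
    """Copy bytes from source over the start of target, keeping its size.

    At most len(target) bytes are written and the file is opened without
    truncation, so the target's length cannot change.
    """
    size = os.path.getsize(target)
    with open(source, "rb") as src:
        payload = src.read(size)
    with open(target, "r+b") as dst:
        dst.write(payload)
    assert os.path.getsize(target) == size
    return len(payload)
```

If the source is shorter than the target, the tail of the original file is left in place after the copied bytes.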

Confirm: did 'dd'ing a WMA into a same-size '.mp3' still work?

Time to reboot to MS Windows again, install Napster and try to get an .ogg into it the "proper" way.

In an MS Windows world

I spent about 3 hours massaging Windows again late into the night. There are a number of ways to access the device.

USB Mass Storage
This works everywhere (if it's not been flashed with the Janus DRM-only firmware). Any files that you drag or copy onto the device are just copied; no update to the database listing the songs is performed, so the device will still not 'see' the songs. After you have copied songs directly into System/MUSIC it is possible to run the Samsung Recovery Utility and ask it to Rebuild Database; this will re-scan the meta-data (ID3 Artist/Title/Album/Length) from all the files found in System/MUSIC and make a new database. It claims that it will take a long time, as it processes every file rather than just updating the meta-data for files that you've added.
Start→My Computer→Right-click on drive Letter→Open as Portable Media Device
This uses a Samsung driver to present a simplified view of the contents of the player in MS Windows Explorer. The view shows only audio files that are registered in the database and does not show any of the file-structure. Effectively it's showing System/MUSIC (and presumably System/AUDIBLE). I have not worked out whether the driver is sending MTP commands to the device or whether it's modifying the database files on the drive. It's possible that the player only speaks the MTP protocol when reflashed with the [evil] Janus lock-in firmware.
MS Windows Media Player→Portable Music Device
This is a bit crap but presumably uses the same driver in a generic way.
Napster Client
This has support for the YH-925. I don't know whether it is using the Windows driver or using some built-in method to access the database on the device. The software that was shipped with the US version of the device immediately popped up a dialogue box saying that you're not allowed to use Napster outside the United States of America ...as it is Unamerican and you'll be arrested. I uninstalled Napster fairly quickly after I'd verified that it didn't do much useful.
Samsung Music Studio (aka Samsung YEPP Player)
This is shipped as a companion program to the player when it is sold in most countries; it's more targeted to the device and has controls for generating suitable playlists. The best way to download this is to go through the sequence: www.samsung.com→Country: Canada→Personal Audio Devices→MP3 Players→YH-925GS→Download Samsung Music Studio. Note that this is different from the Samsung Multimedia Studio that is already shipped on the CD and which is only for uploading/downloading JPEG images to the player for mobile viewing (with the .jpg renamed .jpx).
Mass storage and Linux/Unix utility
Not done yet, but a 5kB Python script is so much nicer than a 15MB bloatware package that does the same thing, only slower. You can see my initial hackings so far in yh-925-db-0.1.py. You might need to modify the file as it hunts for 'System.backup/...' and looks in there. It should really take the '.hdr' on the commandline and give you the rest automatically.

No Ogg'ing Success

I believe the hardware is capable of running software that can play Ogg Vorbis. The software for the YH-925 is stored in the firmware file that the device loads on boot (Q: does it reflash itself or just load from the hard-disk?). I cannot work out how to get Oggs registered in the database...

Initially MS Media Player refused to play Ogg files. After I downloaded and installed the Win32 Vorbis codecs ("DirectShow Filters") it will now play the audio, along with Napster Client which gains the same ability ...and so does the Samsung Music Studio. None of the programs will transfer Ogg files. Media Player says that it can't transfer the file and that it doesn't know how to convert the Ogg into a suitable format. Music Studio says "Your devices does not support this file. Ca..." and helpfully cuts off the end of the error message in a non-scrollable line. Googling for this message turns up nothing.

Renaming an .ogg to .mp3 doesn't fool the system. Renaming the file and placing it into System/MUSIC and rebuilding the database with the Windows-based recovery tool does not fool it either.

I'm puzzled, so let's try the Samsung HQ Forum and see if we can get any helpful pointers from there. This is the post I sent asking for instructions.

Firmware

In the meantime, I've come across a few versions of the firmware; u1.46 is what came on the device (it was exported to Finland). It currently has 1.58EU on it, which shows up the Radio feature (but it doesn't seem to work; maybe it doesn't have the chip on the board?). I've also seen 1.61CA linked to on a webpage but this hasn't been tried on the device. They all need renaming to FW_YH925.mi4 and placing in the root folder; the system then uses this file the next time it boots.

EU firmware (with Radio support).

Boot Loader

One thing I did find somewhere is a copy of the Bootloader ROM file that contains the base functionality for the gadget, this likely includes FAT filesystem support for finding the firmware image to load for the OS. More importantly, it includes the base USB Mass storage device that allows the system always to be mounted as an external hard-disk. This is named BL_YH925.rom and is 120072 bytes long, to fit in a 1Mbit flash.
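
As a quick sanity check of the size, taking 1Mbit to mean 1024 x 1024 bits (an assumption about which flash part is meant):

```python
ROM_BYTES = 120072                  # size of BL_YH925.rom
FLASH_BYTES = 1 * 1024 * 1024 // 8  # 1 Mbit expressed in bytes

spare = FLASH_BYTES - ROM_BYTES
print(FLASH_BYTES, spare)           # 131072 11000
```

So the image would fit with roughly 11kB to spare.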

This file is definitely interesting and we can find lots of strings in it, including the annoying error messages about not unplugging the system before unmounting from Windows; if I knew how to reflash the device then I'd replace the 'Windows' strings with 'Linux' or something OS-neutral. If it's any help, this boot-loader file was originally found in System/Params/BL_YH925.rom, which is where the Playlists normally are...

Some interesting strings are: AKM Codec Test Started (AKM appear to make the chip that does the actual decoding); 6005 MP3 Player and PRTLPLYR6005 with 5020 (an internal product name, or a reference platform/prototype board from PortalPlayer?); SYSTEM\FW_YH925.MI4 and SYSTEM\PP5020.BIN (two locations to look for the main firmware?); SW3 Boot FAT32 HDD, SW5 Run Diagnostics and various references to SW11 and MENU buttons. The question is which are the numbered buttons: do they even appear on the outside of the case, or are they purely debugging contacts hidden on the PCB?

Testing, unplugging and freezing

Testing has meant unplugging and replugging the player each time. The message saying that you should 'safely unplug' (unmount) the device always seems to show up, so you'll just have to unplug it anyway and ignore the scary messages. Make sure you do something like pumount usbdisk ; sync so that the filesystem stays sane, as VFAT isn't journalled.

The device is also prone to occasional freezes/crashes with the u1.46 firmware that came with it; I haven't observed the 1.58EU firmware for long enough to comment.

I've had to reset the device (by putting a pencil into the recessed reset button on the back) a couple of times after it has locked up. Doing this turns the player off and resets the database. Now none of the tracks show up in the menus, presumably because they have been zeroed. The menu Settings->About->Tracks still says 4, which is what Windows left it as.

I think a reset only touches the database and doesn't go trashing the contents of the hard-disk in any way; so your data is probably safe even if it hangs on you. You might need to keep a copy of the meta-data so having a backup-copy of System/DATA stored on the player's harddisk would be useful for when you don't want to rebuild the database.
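
Taking that backup is a few lines of Python. A sketch: the 'System.backup' directory name echoes where my script already looks, but the timestamped sub-directory and the function name are invented for the example, and the mount point is whatever your system gives the player.

```python
import os
import shutil
import time


def backup_database(mount_point, backup_root=None):
    """Copy System/DATA to a timestamped directory and return its path."""
    source = os.path.join(mount_point, "System", "DATA")
    if backup_root is None:
        # Default to keeping the copy on the player's own disk.
        backup_root = os.path.join(mount_point, "System.backup")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = os.path.join(backup_root, "DATA-" + stamp)
    shutil.copytree(source, dest)
    return dest
```

Restoring would be the same copy in the other direction, done while the player is mounted and followed by a sync before unplugging.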

Samsung Music Studio and drivers

A list of interesting strings from the Windows driver and player; a couple of references to Audible and the YH-820, YH-920 and YH-925.

.mi4 firmware

Apparently a utility called 'ihpfirm' made by somebody called Dave Hooper ('stripwax') will decrypt the firmware image. This is designed for an m68k/Coldfire firmware file and doesn't unpack the ARM image out of the box.

Audible .aa format

Audible seems to be a container for four different compression schemes. Mono MP3 being the highest quality and VoiceAge's acelp.net codecs (seem to be related to G.729) for the lower qualities. audiocoding.com, audible.com, digitalpreservation.gov.


Paul Sladen
Last modified: Sat Mar 25 15:11:24 GMT 2006