Media files are large, and sending the whole file to the client before playback starts has several drawbacks:

  1. If the user stops playback in the middle, the rest of the fetched data is wasted, and for mobile users that bandwidth can be costly.
  2. The user has to wait for the entire file to download before playback can start.
  3. If the user wants to play from the middle, both of the above issues arise as well.

A better option is to stream the content using a streaming protocol. There are various streaming protocols out there, and one of the most popular is Apple's HTTP Live Streaming (HLS). When media is encoded for HLS, a manifest (playlist) file is created in the M3U8 format.

M3U8 is the UTF-8 encoded version of the M3U format, which is used to define playlists or to describe media streaming information. We are interested in the latter use case. Here is a sample M3U8 file with media streaming information:

#EXTM3U
#EXT-X-VERSION:4
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-ALLOW-CACHE:YES
#EXT-X-TARGETDURATION:5
#EXTINF:4.040278000000001,
#EXT-X-BYTERANGE:91744@0
ty678.ts
#EXTINF:4.1331549999999995,
#EXT-X-BYTERANGE:92872@91744
ty678.ts
#EXTINF:4.040267,
#EXT-X-BYTERANGE:90616@184616
ty678.ts
#EXTINF:4.017056000000002,
#EXT-X-BYTERANGE:90804@275232
ty678.ts

The file above describes the audio stored in ty678.ts. Each #EXT-X-BYTERANGE entry has the form length@offset and gives the byte range of one segment within that media file, while the #EXTINF line before it gives that segment's duration in seconds. The target segment duration (4 seconds in this case) can be given as an argument when encoding.
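
To make the byte-range layout concrete, here is a minimal sketch of reading such a manifest in Python. The names Segment and parse_byterange_playlist are made up for this illustration, and the parser only understands the #EXTINF and #EXT-X-BYTERANGE tags shown in the sample; it is not a complete HLS parser.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    duration: float  # seconds, from #EXTINF
    length: int      # bytes, the part of #EXT-X-BYTERANGE before the '@'
    offset: int      # byte offset into the media file, the part after the '@'
    uri: str         # media file the byte range refers to


def parse_byterange_playlist(text: str) -> List[Segment]:
    """Collect duration, length, offset and URI for each segment (hypothetical helper)."""
    segments = []
    duration = length = offset = None
    next_offset = 0  # used when a BYTERANGE tag omits '@offset'
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:<duration>,<optional title>"
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line.startswith("#EXT-X-BYTERANGE:"):
            value = line[len("#EXT-X-BYTERANGE:"):]
            length_part, _, offset_part = value.partition("@")
            length = int(length_part)
            offset = int(offset_part) if offset_part else next_offset
        elif line and not line.startswith("#"):
            # A non-tag line is the URI that the preceding tags describe.
            if duration is not None and length is not None:
                segments.append(Segment(duration, length, offset, line))
                next_offset = offset + length
            duration = length = offset = None
    return segments


# First two segments of the sample manifest above.
SAMPLE = """#EXTM3U
#EXT-X-VERSION:4
#EXT-X-TARGETDURATION:5
#EXTINF:4.040278000000001,
#EXT-X-BYTERANGE:91744@0
ty678.ts
#EXTINF:4.1331549999999995,
#EXT-X-BYTERANGE:92872@91744
ty678.ts
"""

for seg in parse_byterange_playlist(SAMPLE):
    # Print the file, first byte, last byte (inclusive) and duration of each segment.
    print(seg.uri, seg.offset, seg.offset + seg.length - 1, round(seg.duration, 2))
```

Note that each segment starts exactly where the previous one ends (91744 = 0 + 91744), so the segments tile the .ts file without gaps.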

Now, if the user asks to play the file from the middle, the media player can fetch just the byte range for the segment around that time and play it, without wasting any bandwidth. It's simple and it works beautifully.
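
As a rough illustration of what a player does when seeking, the sketch below maps a seek time to the HTTP Range header for the segment that contains it. The segment values are copied from the sample manifest; the function name range_header_for is invented for this example, and a real player does considerably more (buffering, prefetching the next segments, and so on).

```python
from typing import List, Tuple

# (duration_seconds, byte_length, byte_offset) copied from the sample manifest
SEGMENTS: List[Tuple[float, int, int]] = [
    (4.040278000000001, 91744, 0),
    (4.1331549999999995, 92872, 91744),
    (4.040267, 90616, 184616),
    (4.017056000000002, 90804, 275232),
]


def range_header_for(seek_seconds: float, segments=SEGMENTS) -> str:
    """Return the HTTP Range header value for the segment containing seek_seconds."""
    if seek_seconds < 0:
        raise ValueError("seek time must be non-negative")
    start = 0.0  # start time of the current segment
    for duration, length, offset in segments:
        if seek_seconds < start + duration:
            # HTTP byte ranges are inclusive on both ends.
            return f"bytes={offset}-{offset + length - 1}"
        start += duration
    raise ValueError("seek time is past the end of the stream")


print(range_header_for(10.0))
# bytes=184616-275231
```

Seeking to the 10 second mark lands in the third segment, so only about 90 KB needs to be fetched instead of the whole file.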

Let's try this with an example on AWS. The following example is implemented using Python and the boto3 library.

First, let's get a boto3 Elastic Transcoder client. You will need an AWS access key ID and secret access key before you begin.

import boto3

AWS_REGION_NAME = "us-east-1"
AWS_ACCESS_KEY_ID = "<your_aws_access_key>"
AWS_SECRET_ACCESS_KEY = "<your_aws_secret_key>"

ts_client = boto3.client(
    'elastictranscoder',
    region_name=AWS_REGION_NAME, 
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)

Next, log in to the AWS console, create an S3 bucket, and upload an audio file. You can get a Creative Commons audio file for testing from this SoundCloud link.

For this example, I have uploaded a file to a bucket named transcoder-test with the key uploads/aerocity.mp3.

Next, create a Pipeline in Elastic Transcoder on AWS. It's simpler to do this in the AWS console. Provide an appropriate name and use the same bucket for the input bucket, transcoded files, and thumbnails. After the pipeline is created, note down its ID, which will be used to run the transcoding jobs. In our case, it is 1553339070816-bei2lb.

Next, go to Presets in Elastic Transcoder and get the preset ID for HLS audio (choose whichever preset fits your requirements). We will use the preset named System preset: HLS Audio - 160k, which has the ID 1351620000001-200060.
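
If you would rather look the preset up from code than from the console, something along these lines should work. This is only a sketch: find_preset_id is a hypothetical helper, it assumes the client's list_presets call returns pages with "Presets" and "NextPageToken" keys, and it takes the Elastic Transcoder client as an argument.

```python
def find_preset_id(client, preset_name):
    """Return the ID of the first preset whose Name equals preset_name, or None.

    `client` is expected to be a boto3 Elastic Transcoder client
    (hypothetical helper, not part of boto3 itself).
    """
    page_token = None
    while True:
        # Only pass PageToken once we actually have one.
        kwargs = {"PageToken": page_token} if page_token else {}
        page = client.list_presets(**kwargs)
        for preset in page.get("Presets", []):
            if preset.get("Name") == preset_name:
                return preset["Id"]
        page_token = page.get("NextPageToken")
        if not page_token:
            return None  # no more pages and no match


# Usage, assuming ts_client is the Elastic Transcoder client created earlier:
# PRESET_ID = find_preset_id(ts_client, "System preset: HLS Audio - 160k")
```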

Run the following to start a transcoding job using the pipeline which we created earlier.

PIPELINE_ID = "1553339070816-bei2lb"
PRESET_ID = '1351620000001-200060'

SEGMENT_DURATION = '4.0'  # target segment length in seconds (the API expects a string)
OUTPUT_KEY_PREFIX = "encoded/"


input_key = "uploads/aerocity.mp3"

output_key = "aerocity"
stream_key = "aerocity_stream"



response = ts_client.create_job(
    PipelineId=PIPELINE_ID,
    Input={ 'Key': input_key, },
    Outputs=[{
        'Key': output_key,
        'PresetId': PRESET_ID,
        'SegmentDuration': SEGMENT_DURATION,
    },],
    OutputKeyPrefix=OUTPUT_KEY_PREFIX,
    Playlists=[
        {
            'Name': stream_key,
            'Format': 'HLSv4',
            'OutputKeys': [
                output_key,
            ],
        },
    ],
    UserMetadata={
        # 'string': 'string'
    }
)

job_id = response["Job"]["Id"]

streaming_key = OUTPUT_KEY_PREFIX + output_key + "_v4.m3u8"
transcoded_key = OUTPUT_KEY_PREFIX + output_key + ".ts"

This will start the transcoding job. We can check the status of the job using the following:

response = ts_client.read_job(Id=job_id)
status = response["Job"]["Status"]

# Possible states: "Submitted", "Progressing", "Complete", "Canceled", "Error"
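
Since transcoding takes a little while, you will usually want to repeat that status check until the job finishes. Below is a minimal polling sketch; wait_for_job, poll_seconds and timeout_seconds are names made up for this example, and the function simply calls read_job on whatever client you pass in.

```python
import time


def wait_for_job(client, job_id, poll_seconds=5.0, timeout_seconds=600.0):
    """Poll read_job until the job leaves the Submitted/Progressing states.

    Returns the final status string. Raises TimeoutError if the job is
    still running after timeout_seconds (hypothetical helper).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = client.read_job(Id=job_id)["Job"]["Status"]
        if status not in ("Submitted", "Progressing"):
            return status  # "Complete", "Canceled" or "Error"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status} after {timeout_seconds}s")
        time.sleep(poll_seconds)


# Usage, assuming ts_client and job_id from the snippets above:
# status = wait_for_job(ts_client, job_id)
```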

When the status is Complete, the job is done and the files are available at the streaming_key and transcoded_key defined above. To test these files, we first need to make them public.

BUCKET_NAME = "transcoder-test"

s3 = boto3.client(
    's3',
    region_name=AWS_REGION_NAME,
    aws_access_key_id=AWS_ACCESS_KEY_ID,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY
)

s3.put_object_acl(Bucket=BUCKET_NAME, Key=transcoded_key, ACL="public-read")
s3.put_object_acl(Bucket=BUCKET_NAME, Key=streaming_key, ACL="public-read")

# streaming_key  -> encoded/aerocity_v4.m3u8
# transcoded_key -> encoded/aerocity.ts

After making the files public, the streaming file can be accessed at https://transcoder-test.s3.amazonaws.com/encoded/aerocity_v4.m3u8 (replace "transcoder-test" in the URL with your own bucket name).
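
If you want to build that URL from the bucket name and key instead of editing it by hand, a small helper like the one below does the job. public_s3_url is a name invented for this example, and it assumes the same virtual-hosted-style URL pattern used above.

```python
from urllib.parse import quote


def public_s3_url(bucket: str, key: str) -> str:
    """Build the URL of a public S3 object (hypothetical helper)."""
    # quote() leaves "/" unescaped by default, so key prefixes stay readable.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


print(public_s3_url("transcoder-test", "encoded/aerocity_v4.m3u8"))
# https://transcoder-test.s3.amazonaws.com/encoded/aerocity_v4.m3u8
```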

You can test this URL directly in VLC. Go to File -> Open Network, paste the URL into the "URL" input box, and click "Open". Enjoy the music of your first streaming file. :)