From Gnash Project Wiki
Cygnal is the Gnash Project's Flash Media Server-compatible audio and video server.
Cygnal handles negotiating the copyright metadata exchange, as well as streaming the content. It will need to handle many thousands of simultaneous network connections, and support running on large GNU/Linux clusters. It should support handling multiple streams with differing content, as well as a multicast stream with a single data source.
Due to the patent issues surrounding MP3, and the fact that FLV and ON2 are closed formats, one of the main goals of this project is to support free codecs and free protocols as the primary way of doing things. Optionally there will be support for MP3, FLV, and ON2 (VP6 and VP7) when playing existing Flash content. Both FLV and the VP6 & VP7 codecs are included in ffmpeg. Users can use the ffmpeg plugin for Gstreamer to use these proprietary codecs.
How Cygnal Works
The Cygnal networking core is based around message queues for all networking. This is to enable Cygnal to route packets between multiple network connections or RTMP channels.
Buffers are the lowest level form of data storage. Because the full functionality of an STL vector isn't needed, and to avoid its overhead, a custom Buffer class was created. It is merely a pointer to the data plus a byte count. All packets by default are the same size, which is more efficient when streaming. Packets are resized when needed, but commonly only the network protocol packets need to do this. The default size for a buffer is set in libbase/network.h, and is called NETBUFSIZE.
Data can be added to a buffer, and a buffer can be resized, set to zero, and manipulated with a variety of other operations needed for working with a memory buffer. Buffers can also be tested against each other, which carries a small performance hit since all the data has to be compared as well.
The CQue class is an STL deque containing Buffer objects. This class supports the pushing, popping, and peeking of messages. There is also support for removing messages by address, as well as merging multiple full-size messages into a single one to make them easier to parse. Each queue uses mutexes to protect access to the data, as well as a signaling mechanism using Boost condition variables and mutexes.
Each incoming network connection instantiates a Handler class. The Handler class is a top-level object. When a connection request comes in, a Handler is created, which then creates the two queues, one for each direction. The Handler also creates three threads. One thread sleeps until data comes in via the network. It then wakes up, creates a buffer, reads the data into the buffer, adds the buffer to the incoming queue, and notifies the processing thread using a condition variable and a mutex.
An output thread is also created, which sleeps on its own condition variable and mutex. When the processing thread has output data, the data is put on the outgoing queue, and the output thread, once signaled to wake up, writes the buffer to the network.
The third thread sleeps on the incoming thread's mutex until there is data in the queue. When there is, the thread wakes up and processes the data buffers in the queue. All I/O is done via the queues.
All remote connections have two possible names. The first is the name of the application (usually the swf movie name), which is used as a top-level subdirectory for accessing files from the server. The second name, which is optional, is the name of this instance of the swf movie being played. To connect a stream, you need to be able to access the entire path as specified by the NetConnection object.
When a Handler is created, it is stored in Cygnal's global table of network connections. The names of the application and instance are used as the index into this STL map. When a packet needs to be routed from one queue to another, the Handler object for that connection is looked up, and the packet is placed on the proper queue. Messages added to another Handler's queue go onto its incoming queue, much as if they had been read from the network. This lets transferred objects be multiplexed with other streams.
It should also be possible to route objects between instantiations of Cygnal, using existing protocols for this. The purpose would be to let multiple running Cygnals work in a peer-to-peer manner. This is to enable better distribution of the load for things like video conferencing and video broadcasting. This way multiple clients could be connected to their own local Cygnal, which would then pass the packets to the other Cygnal for the other pool of users.
Using ffmpeg or gstreamer, it is possible to convert between codecs. To better support the adoption of patent-free codecs, Cygnal should have support for a filter mechanism to convert a proprietary codec like FLV or MP3 to Ogg Theora or Ogg Vorbis for streaming to a client like Gnash that supports these free codecs. It should then also be possible to use Cygnal to convert video from the proprietary swf player to a free codec supported by Gnash.
Cygnal needs to support embedded devices that wish to stream a single movie from disk or a camera, as well as being configured to support a large cluster handling thousands of concurrent connections. Since Cygnal can talk to other instantiations of itself on other networks, it can be used in a peer-to-peer manner to handle video conversion in a distributed manner rather than through a centralized server.
The most common way of streaming Flash movies is progressive streaming. Progressive streams don't allow seeking within the data once it is playing; the only controls are "play" and "pause". Progressive streaming doesn't even require a media server; it can be done with any web server, with no special handling. This is often referred to as RTMP over HTTP, but there isn't any RTMP protocol used within the client-server communication. The protocol is primarily HTTP using GET and POST directives; RTMPT adds just a few additional commands.
A Flash media server has several abilities beyond just sending a file stream to the player. Using the RTMP protocol, it is possible to dynamically seek within the stream. This puts the control of the movie on the server side instead of the client side. This is called dynamic streaming. Along with seeking, dynamic streaming also supports capturing a stream, which allows one to send a data stream to the server, where it can be stored and later replayed.
The server collects statistics on the number of connections, the bandwidth consumed, which files are streamed, the frame rate, etc. This data can be used to tune the performance of the server while it is running. Some statistics modify the stream while it is playing, and the others are for load balancing on larger installations, like a cluster.
The server can also transcode between codecs. The FMS server can only convert MPEG4. Since Cygnal uses Gstreamer, it can convert between any supported codec. Along with this, by using the statistics collected for each data transfer, the server can also change the resolution of the movie to adjust to varying network connectivity. Also included will be support for transcoding Shoutcast streams using libshout.
Along with progressive and dynamic streaming support, Cygnal also supports multicasting. This way a single source, often a "real-time" one like a sports game, can be viewed by multiple clients without duplicating the data. This source can't be seeked, so it functions like a variation of progressive streaming. This can be used for broadcast-style video streaming.
There are several utility functions built into the server as well, for navigation purposes. The main one is used to generate thumbnails or a short preview of a movie automatically, without having to do this as part of another operation. These features can also be used to merge clips from multiple movies. Each clip is played for the specified amount of time, then the next clip is played, starting and stopping at the specified times, and so on until all the clips are played.
Cygnal should also support additional metadata, namely closed captioning and subtitles. Often this metadata is stored as a separate disk-based file, and needs to be streamed to the client roughly synchronized with the video stream. While this isn't supported by the proprietary swf player, it can be supported by Gnash by using an additional RTMP channel or network connection for this additional content. A swf-based media player using Gnash would be able to access this additional datastream, to display based on the user's preferences.
Cygnal will handle negotiating the Open Rights Management information exchange.
Flash Communication Server features
This is a list of features extracted from the O'Reilly book on the Adobe Flash Media Server (FMS). While probably not all of these are worth implementing in Cygnal, the list is interesting.
- Seeking in downloading movies only works for the cached part that has already been transferred. Seeks to the undownloaded part of a stream are forbidden.
- Video and Audio can be uploaded into a safe sandbox area for later downloading.
- The current FMS supports only server-side ActionScript 1; the newer one supports server-side ActionScript 2.
- Server can directly connect streams between clients.
- Only supports point to point connections, multicasting isn't supported.
- All server side extensions are written in ActionScript.
- Can upload and store Shared AMF Objects.
- The software license limits the number of permitted connections.
- Tracks statistics on online users.
- Needs separate copies of media, to handle different bandwidth network connections.
- Only dynamically transcodes from mpeg4.
- Audio and data have the highest priority, video packets are thrown away to stay synchronized.
Other Streaming Media Servers
There are several other streaming servers that handle streaming audio and video. Some handle multiple formats, but most have a protocol supported only by that one project (like Shoutcast). None but Red5 supports Flash, and that feature isn't working yet anyway.
The basic client server communication is:
- Client→Server : sends a CreateStream request (is it a single RTMP packet?)
- Server→Client : sends a response with a streamIndex number
- Client→Server : does a publish (what does it mean in this context?)
- Client→Server : sends the audio/video packets (the packets are sent from the source as indicated by the streamIndex, via the same channel as the publish request)
Each NetConnection can handle up to 64 AMF objects, independent of each other.
HTTP Standards presents and discusses the full and as-precise-as-it-gets definition of HTTP.
HTTP is used when tunneling RTMP over HTTP (port 80) to get around firewalls. The Flash client issues GET requests to have content sent to it by the server. That is documented on the RTMPT page.
To get the file specified by this URL: http://www.foobar.com/path/file.html
- send the GET request to the server
GET /path/file.html HTTP/1.0
- the server responds with this:
HTTP/1.0 200 OK
Date: Fri, 31 Dec 2000 23:59:59 GMT
The "Content-Length" header is the byte count of the file that is about to be sent to the client. After this response, the file is transmitted.
There are several obscure things I noticed when working on Cygnal's HTTP support. All lines are terminated with a "\r\n" pair, instead of the usual unix "\n". After the headers, there is an additional "\r\n" pair before starting the body of the message.
There is also the issue of the persistence of the network connection, also discussed on the RTMPT page. HTTP 1.0 only supported a separate network connection for each request. Forcing a persistent connection was probably why RTMPT was invented. With HTTP 1.1, all HTTP connections became persistent by default. Two new fields, Keep-Alive and Connection, were added to toggle persistence off.
From what I see when packet sniffing network connections to Apache, Connection can be keep-alive or close, although it's almost always keep-alive. A timeout value can be specified; the default is 15 seconds. Unless the server closes the network connection, the browser will never drop it, so you get the little spinning hourglass that never goes away.