[dpdk-dev] [PATCH v3 1/3] dfs:add FUSE based filesystem for DPDK

Wiles, Keith keith.wiles at intel.com
Mon Dec 17 16:01:06 CET 2018



> On Dec 17, 2018, at 5:45 AM, Thomas Monjalon <thomas at monjalon.net> wrote:
> 
> Hi Keith,
> 
> 16/12/2018 18:46, Keith Wiles:
>> DFS stands for DPDK Filesystem, which helps expose data
>> and control files in a FUSE based filesystem. The dfs requires
>> libfuse3 and libjansson to be present in the Linux system.
> 
> You presented this idea at the DPDK Summit in Dublin,
> and I have not seen any public discussion about the idea.
> Let's try to debate it here.
> 
>> DFS provides a simplified API on top of the FUSE APIs to
>> simplify creating and using FUSE.
>> 
>> Here is the github repo: https://github.com/akheron/jansson
>> Here is the github repo: https://github.com/libfuse/libfuse
>> 
>> Please use your system updater tool yum, apt-get, ... to add
>> support for these two libraries. Also read the dfs documentation
>> in the docs directory for more details.
> [...]
>> +DPDK File System (DFS)
>> +======================
>> +
>> +This DPDK library provides a pseudo file system much like Linux /proc or /system
>> +file systems, but this one is a userspace file system. Allowing applications to
>> +create and design file system directories and files using the FUSE 3.2
>> +https://github.com/libfuse/libfuse code.
> 
> My first thought is we are missing the problem statement.
> What are you trying to solve? Which use case?

The issue is that we do not have a clean way for users and developers to extract information from DPDK or the application for monitoring and control. Using APIs is fine from the application's perspective for getting and setting information, but the user or admin of a system has no access to those APIs unless a developer exposes them via a command line or some other method. Every developer creating an application would have to provide this basic level of information via some method in a cloud or VNF system to allow user or orchestration access. With DFS the developer would not normally have to write the access/display methods himself, saving him time to develop his application.

A file system is the simplest and easiest way to give host command-line access to that information. Control or initialization for DPDK and the application can also be added through this same simple method. Currently I do not have any control support in DFS, but it would be reasonable to add these controls (in a limited way) to remove DPDK command-line options or cmdline cmds when starting DPDK. Giving some runtime control is better for the application in cloud or NFV environments.

> 
> In DPDK, we have some run-time options accessible from the command line,
> and some public API functions.

We have cmdline access to these functions, and let's face it, cmdline is very difficult for most to use :-), but we do not have access at the system level. The APIs in DFS are a much easier and cleaner way to get at the required information. The application developer can also use these APIs to expose information without having to give some type of cmdline access. An application may not want a cmdline interface (or may not be allowed to provide one), but still wants access to the information in the application and DPDK.

Having access via the command line or a bash shell is much easier and provides simple access from other languages like Go, Python, Bash, Lua … any language that can read or write a normal filesystem. The system-level administrator or application developer can then write tools in whatever language he sees fit.
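To illustrate the point: since DFS depends on libjansson, it is reasonable to assume a DFS file can serve JSON-encoded stats that any scripting language parses directly. A minimal Python sketch follows; the file name and JSON layout here are hypothetical, and a temporary stand-in file is used in place of a real DFS mount so the example is self-contained:

```python
import json
import os
import tempfile

# Hypothetical example: a DFS file (e.g. something like <mount>/info) is
# assumed to serve JSON stats -- DFS uses libjansson, so JSON is assumed.
# A stand-in file replaces the real FUSE mount so this sketch runs anywhere.
sample = {"lcores": 4, "ports": [{"id": 0, "rx_packets": 1024}]}

with tempfile.TemporaryDirectory() as mnt:
    path = os.path.join(mnt, "info")        # stand-in for a DFS file
    with open(path, "w") as f:
        json.dump(sample, f)

    # Any tool -- Python, Go, Bash, Lua -- just reads it like any other file.
    with open(path) as f:
        info = json.load(f)

print(info["lcores"])                       # -> 4
print(info["ports"][0]["rx_packets"])       # -> 1024
```

No DPDK headers, bindings, or cmdline access are needed; plain file I/O is the whole interface, which is the argument being made above.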

> I think it is agreed that the primary interface must be the API functions.
> The command line options should be a responsibility of the application
> to implement them or not.

It is not agreed; customers I have spoken with do not agree that DPDK information must be supplied by the application. In this case it should be the responsibility of DPDK to provide that information in some easy-to-access way. If the application side wants to provide information or control, then the developer can use DFS as well, or not.
> 
> I think exposing a filesystem tree interface would be also an application
> responsibility. Actually, it could be part of a DPDK application,
> or be run as a secondary process application.

The secondary process method means the user must use the same version of the application and DPDK to access the primary, or we can crash the secondary or primary application. It is very easy to use the wrong version of a DPDK secondary application and get invalid information because the primary differs in some offset within a structure.

Remember that cloud or NFV systems will have a large number of applications and versions. The secondary process model is also a bit of a security problem and can cause the primary application to crash if the secondary does something wrong. The system would also have to provide a different secondary application matching every DPDK/application combination on the platform. Having to match up secondary applications to each and every application is a nightmare for the admin or developer.

> It is probably a good idea to implement it as a ready-to-use library.
> As it can be implemented on top of DPDK API, I think it should live
> outside of the main repository. As any other DPDK applications, it will
> require to be updated when the DPDK APIs are updated or extended.
> In my opinion, we should not mandate DPDK contributors to update DFS
> when adding or updating an API. That's why it should not be part of the
> DPDK package.

The only changes required of DPDK developers are when a common API in DPDK that DFS uses changes. It would not be difficult to keep these updated, as such API changes do not happen often. Requiring DPDK developers to maintain such changes is already expected for other parts of DPDK, correct?

Maintaining DFS outside of DPDK is an artificial requirement, as it does not place any more burden on DPDK developers IMO. It does place some one-time work on the DPDK test system to test DFS. The test system would need to build DPDK with DFS and provide some basic level of verification.

The test system can also use DFS in normal testing to extract DPDK information without having to run expect scripts against the cmdline interface, which are very error-prone since command syntax can change. If the DPDK test system does not do this level of testing, I think we should add it to verify DPDK.

> One more argument to keep it outside: for diversity and innovation,
> it is better to not enforce one (new) control interface as the official
> (and suggested) one way.

The current control method does not work for all cases; it works only for the developer. The system admin or the developer must remember cmdline syntax, which is difficult to use (look at testpmd ;-). In some cases a command line built into the application may not be reasonable in a normal runtime environment. Accessing the filesystem is normal and easy, as the admin, developer, user, and orchestration all have access to the filesystem.

> 
> One more question,
> Is there something exposed in DFS that has not a public DPDK API?
> If so, we should try to fill the gaps.

Yes, only a couple of debug routines: the ring and tailq walk routines I included in the patch. I was not focused on adding more APIs to the DPDK main code, only on exposing the information in DPDK using the current APIs. DFS should only use standard DPDK APIs to expose information to the file system, and the application can write very simple routines to expose any other information it wants via DFS.

The APIs currently exposed in DFS cover just the simple stats and information I could collect from DPDK today, but I believe we can expose other information very quickly as we define it.

> 
> 

Regards,
Keith


