Tuesday, March 28, 2006

Q&A: How OS X Executes Applications

After writing my previous article, How OS X Executes Applications, I received quite a few comments and e-mails with some good questions. I will attempt to answer some of them here, and continue to update this entry as questions arise.

Question 1. What is libSystem.B.dylib?

evil:~/Temp mohit$ otool -L /bin/ls
/bin/ls:
/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

The System library, found in /usr/lib/libSystem.dylib, is simply a collection of core libraries that are used by most Darwin applications. A few libraries worth mentioning that are in libSystem.dylib are:
  • libc : The standard C library.
  • libdl : The dynamic loader library.
  • libm : The math library.
  • libpthread : The POSIX threads library.
  • libinfo : The NetInfo library.
To get a complete list of modules and symbols within the library, use the -Tv switch of the otool command.

Question 2. Is there an objdump for OS X?

Yes there is, and it supports Mach-O binaries. It's just not distributed with Darwin / OS X. This link on this site tells you which systems objdump is distributed with.

Question 3. Are executable code and read-only data in the same __TEXT segment? If so, how can they mark part of it executable and part not executable (normal security practice nowadays)?

I actually updated the article with the answer to this, but it's a good question, and I'll answer it here again.

Segments may be sub-divided into sections. Within the __TEXT segment, only certain sections, e.g., __text, or __picsymbol_stub, can contain executable code.

To determine which sections contain executable code, use the -lv parameter with otool, and look at the attribute named attributes.
evil:~/Temp mohit$ otool -lv /bin/ls | egrep '(sectname|attributes)'
sectname __text
attributes PURE_INSTRUCTIONS SOME_INSTRUCTIONS
sectname __picsymbol_stub
attributes PURE_INSTRUCTIONS
sectname __symbol_stub
attributes PURE_INSTRUCTIONS
sectname __picsymbolstub1
attributes PURE_INSTRUCTIONS SOME_INSTRUCTIONS
sectname __cstring
attributes (none)
sectname __symbol_stub1
attributes PURE_INSTRUCTIONS SOME_INSTRUCTIONS
sectname __literal8
attributes (none)
sectname __eh_frame
attributes NO_TOC STRIP_STATIC_SYMS LIVE_SUPPORT
sectname __data
attributes (none)
sectname __nl_symbol_ptr
attributes (none)
sectname __la_symbol_ptr
attributes (none)
sectname __dyld
attributes (none)
sectname __common
attributes (none)
sectname __bss
attributes (none)

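The sections whose attributes include PURE_INSTRUCTIONS (or SOME_INSTRUCTIONS) contain executable code. Sections such as __cstring and __literal8 carry neither attribute, so they hold data only.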
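Without -v, otool prints these attributes as a raw flags word, for example the 0x80000400 shown for __text in the original article. Here is a minimal Python sketch of decoding that word, assuming the S_ATTR_* bit values from /usr/include/mach-o/loader.h. The function names section_attributes and has_code are my own and not part of any tool.

```python
# Attribute bits, as I read them from the S_ATTR_* constants in
# /usr/include/mach-o/loader.h. Treat the table as an assumption and
# check it against the header on your own system.
SECTION_ATTRIBUTES = [
    (0x80000000, "PURE_INSTRUCTIONS"),
    (0x40000000, "NO_TOC"),
    (0x20000000, "STRIP_STATIC_SYMS"),
    (0x10000000, "NO_DEAD_STRIP"),
    (0x08000000, "LIVE_SUPPORT"),
    (0x04000000, "SELF_MODIFYING_CODE"),
    (0x00000400, "SOME_INSTRUCTIONS"),
    (0x00000200, "EXT_RELOC"),
    (0x00000100, "LOC_RELOC"),
]


def section_attributes(flags):
    """Return the names of the attribute bits set in a section's flags word."""
    return [name for bit, name in SECTION_ATTRIBUTES if flags & bit]


def has_code(flags):
    """True if the section is marked as containing machine instructions."""
    names = section_attributes(flags)
    return "PURE_INSTRUCTIONS" in names or "SOME_INSTRUCTIONS" in names


# 0x80000400 is the flags value otool -l reports for (__TEXT,__text) in /bin/ls.
print(section_attributes(0x80000400))
```

The low byte of the flags word holds the section type rather than an attribute, which is why the table ignores it.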

Question 4. How do I decipher the constants in the otool output?

There are two ways to do this: One way is to examine the header files in /usr/include/mach and /usr/include/mach-o; and the other, simpler, way is to just add -v to your otool commands.
evil:~/Temp mohit$ otool -vh /bin/ls      
/bin/ls:
Mach header
magic cputype cpusubtype filetype ncmds sizeofcmds flags
MH_MAGIC PPC ALL EXECUTE 11 1608 NOUNDEFS DYLDLINK TWOLEVEL

Question 5. What are Two-Level Namespaces?

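Two-level namespaces are a feature, included since OS X 10.1, that prevents symbol-name collisions between dynamic libraries. It works by recording, alongside each imported symbol, the name of the library that is expected to provide it. This association is made at build time, when the program is linked.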

Suppose you have an application that is linked against libfirst and libsecond. libfirst exports a function called dothis(). At a later time, a new version of libsecond comes out with its own dothis() function. Now, the application may execute whichever dothis() function it loads first, which may not be the one that was intended.

With two-level namespaces (enabled by default), the linker associates dothis() with libfirst at build time. This prevents symbol collisions when future versions of the linked libraries add new symbols.
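To make the difference concrete, here is a toy Python model of the libfirst / libsecond scenario. It is only an illustration of the lookup rules, not of how dyld is implemented. The dictionaries and the functions flat_lookup and two_level_lookup are hypothetical names I made up for the sketch.

```python
# Each "library" is just a dict of exported symbol name -> implementation.
libfirst = {"dothis": lambda: "libfirst.dothis"}
# A newer libsecond that has started exporting its own dothis().
libsecond = {
    "dothat": lambda: "libsecond.dothat",
    "dothis": lambda: "libsecond.dothis",
}


def flat_lookup(symbol, load_order):
    """Flat namespace: the first loaded library exporting the symbol wins."""
    for library in load_order:
        if symbol in library:
            return library[symbol]
    raise LookupError("undefined symbol: " + symbol)


def two_level_lookup(symbol, library_name, loaded):
    """Two-level namespace: the providing library was recorded at link time."""
    return loaded[library_name][symbol]


loaded = {"libfirst": libfirst, "libsecond": libsecond}

# If libsecond happens to be searched first, the flat lookup picks the wrong one.
print(flat_lookup("dothis", [libsecond, libfirst])())
# The two-level lookup ignores load order, because the library name is part of the reference.
print(two_level_lookup("dothis", "libfirst", loaded)())
```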

Question 6. Is Steve Jobs going to have you executed for reverse engineering this information?

Yes he is.

Seriously though, all this information is public knowledge. I did not "reverse engineer" anything. All I did was put together the most relevant parts of the documents mentioned at the end of the article. And I would suggest reading them for a deeper understanding of the OS X runtime environments.

Sunday, March 26, 2006

How OS X Executes Applications

Being a long-time UNIX user, I generally have a common set of tools that I work with while trying to troubleshoot system problems. More recently, I have been developing software that adds Apple's OS X to the list of supported operating systems; and unlike traditional UNIX variants, OS X does not support many of the tools that relate to loading, linking and executing programs.

For example, when I come across library relocation problems, the first thing I do is run ldd on the executable. The ldd tool lists the dependent shared libraries that the executable requires, along with their paths if found.

On OS X though, here's what happens when you try to run ldd.
evil:~ mohit$ ldd /bin/ls
-bash: ldd: command not found

Not Found? But it's on all the common UNIX flavours. I wonder if objdump works.
$ objdump -x /bin/ls
-bash: objdump: command not found

Command not found. What's going on?

The problem is that unlike Linux, Solaris, HP-UX, and many other UNIX variants, OS X does not use ELF binaries. In addition, OS X is not part of the GNU project, which is home to tools like ldd and objdump.

In order to get a list of dependencies for an executable on OS X, you need to use otool.
evil:~ mohit$ otool /bin/ls
otool: one of -fahlLtdoOrTMRIHScis must be specified
Usage: otool [-fahlLDtdorSTMRIHvVcXm] object_file ...
-f print the fat headers
-a print the archive header
-h print the mach header
-l print the load commands
-L print shared libraries used
-D print shared library id name
-t print the text section (disassemble with -v)
-p start dissassemble from routine name
-s print contents of section
-d print the data section
-o print the Objective-C segment
-r print the relocation entries
-S print the table of contents of a library
-T print the table of contents of a dynamic shared library
-M print the module table of a dynamic shared library
-R print the reference table of a dynamic shared library
-I print the indirect symbol table
-H print the two-level hints table
-v print verbosely (symbolicly) when possible
-V print disassembled operands symbolicly
-c print argument strings of a core file
-X print no leading addresses or headers
-m don't use archive(member) syntax
evil:~ mohit$ otool -L /bin/ls
/bin/ls:
/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 88.0.0)

Much better. I can see that /bin/ls references two dynamic libraries. Though, the filename extensions don't look at all familiar.

I'm quite sure that many UNIX / Linux users have had similar experiences while working on OS X systems, so I decided to write a little on what I have learnt so far about OS X executable files.

The OS X Runtime Architecture

A runtime environment is a framework for code execution on OS X. It consists of a set of conventions that define how code is loaded, managed and executed. When an application is launched, the relevant runtime environment loads the program into memory, resolves references to external libraries, and prepares the code for execution.

OS X supports three runtime environments:
  • Dyld Runtime Environment: The preferred runtime environment based on the dyld library manager.
  • CFM Runtime Environment: A legacy environment inherited from OS 9. This is really designed for applications that want to use some of the newer OS X features, but have not been completely ported to dyld yet.
  • The Classic Environment: This environment makes it possible for unmodified OS 9 (9.1 or 9.2) applications to run on OS X.
This article will primarily focus on the Dyld Runtime Environment.

The Mach-O Executable File Format

In OS X, almost all files containing executable code, e.g., applications, frameworks, libraries, kernel extensions etc., are implemented as Mach-O files. Mach-O is a file format and an ABI (Application Binary Interface) that describes how an executable is to be loaded and run by the kernel. To be more specific, it tells the OS:
  • Which dynamic loader to use.
  • Which shared libraries to load.
  • How to organize the process address space.
  • Where the function entry-point is, and more.
Mach-O is not new. It was originally designed by the Open Software Foundation (OSF) for their OSF/1 operating system, which was based on the Mach microkernel. It was later adapted to x86 systems with OpenStep.

To support the Dyld Runtime Environment, all files must be built using the Mach-O executable format.

How Mach-O Files are Organized

Mach-O files are divided into three regions: a header, a load commands region, and the raw segment data. The header and load commands regions describe the features, layout and other characteristics of the file, while the raw segment data region contains ranges of bytes that are referenced by the load commands.

To investigate and examine the various parts of Mach-O files, OS X comes with a useful program called otool located in /usr/bin.

In the following sections, we will use otool to learn more about how Mach-O files are organized.

The Header

To view the Mach-O header of a file, use the -h parameter of the otool command.
evil:~ mohit$ otool -h /bin/ls
/bin/ls:
Mach header
magic cputype cpusubtype filetype ncmds sizeofcmds flags
0xfeedface 18 0 2 11 1608 0x00000085

The first thing specified in the header is the magic number. The magic number identifies the file as either a 32-bit or a 64-bit Mach-O file. It also identifies the endianness of the CPU that it was intended for. To decipher the magic number, have a look at /usr/include/mach-o/loader.h.

The header also specifies the target architecture for the file. This allows the kernel to ensure that the code is not run on a processor-type that it was not written for. For example, in the above output, cputype is set to 18, which is CPU_TYPE_POWERPC, as defined in /usr/include/mach/machine.h.

From these two entries alone, we can infer that this binary was intended for 32-bit PowerPC based systems.
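The same inference can be made programmatically. Below is a minimal Python sketch that decodes the seven 32-bit header fields shown above and derives the word size and byte order from the magic number. The magic and CPU constants are the ones I know from loader.h and machine.h. The function name parse_mach_header and the layout of the returned dict are my own choices.

```python
import struct

MH_MAGIC = 0xFEEDFACE     # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF  # 64-bit Mach-O
CPU_TYPES = {7: "I386", 18: "POWERPC"}  # only the two CPU types this article mentions

FIELDS = ("magic", "cputype", "cpusubtype", "filetype", "ncmds", "sizeofcmds", "flags")


def parse_mach_header(data):
    """Decode the first 28 bytes of a thin (single-architecture) Mach-O file."""
    # Try both byte orders; the one that yields a known magic is the file's order.
    for order in (">", "<"):
        (magic,) = struct.unpack(order + "I", data[:4])
        if magic in (MH_MAGIC, MH_MAGIC_64):
            break
    else:
        raise ValueError("not a thin Mach-O file")
    header = dict(zip(FIELDS, struct.unpack(order + "7I", data[:28])))
    header["bits"] = 64 if magic == MH_MAGIC_64 else 32
    header["byte_order"] = "big" if order == ">" else "little"
    # The top byte of cputype carries ABI flags, so mask it off before the lookup.
    header["cpu_name"] = CPU_TYPES.get(header["cputype"] & 0x00FFFFFF, "unknown")
    return header


# The header values otool printed for /bin/ls, packed as a big-endian file would store them.
sample = struct.pack(">7I", 0xFEEDFACE, 18, 0, 2, 11, 1608, 0x85)
info = parse_mach_header(sample)
print(info["bits"], info["byte_order"], info["cpu_name"])
```

On a real system you would pass it the first 28 bytes read from the file in binary mode.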

Sometimes binaries can contain code for more than one architecture. These are known as Universal Binaries, and generally begin with an additional header called the fat_header. To examine the contents of the fat_header, use the -f switch of the otool command.
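As a rough sketch of what otool -f reads, the Python below walks a fat_header and its fat_arch entries. It assumes the layout I know from /usr/include/mach-o/fat.h: a big-endian magic of 0xcafebabe, a count, and then five 32-bit fields per architecture. The function name and the sample offsets and sizes are made up for illustration.

```python
import struct

FAT_MAGIC = 0xCAFEBABE


def parse_fat_header(data):
    """Return one dict per architecture slice listed in a universal binary."""
    # The fat header is always stored big-endian, whatever the slices contain.
    magic, nfat_arch = struct.unpack(">II", data[:8])
    if magic != FAT_MAGIC:
        raise ValueError("not a universal binary")
    slices = []
    for index in range(nfat_arch):
        start = 8 + index * 20  # each fat_arch entry is five 32-bit fields
        cputype, cpusubtype, offset, size, align = struct.unpack(
            ">5I", data[start:start + 20]
        )
        slices.append(
            {"cputype": cputype, "cpusubtype": cpusubtype,
             "offset": offset, "size": size, "align": align}
        )
    return slices


# Synthetic example: a PowerPC slice and an i386 slice (offsets and sizes invented).
sample = (
    struct.pack(">II", FAT_MAGIC, 2)
    + struct.pack(">5I", 18, 0, 4096, 30000, 12)
    + struct.pack(">5I", 7, 3, 36864, 28000, 12)
)
for entry in parse_fat_header(sample):
    print(entry["cputype"], entry["offset"], entry["size"])
```

Each slice found this way is itself a complete Mach-O file starting at the given offset.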

The cpusubtype attribute specifies the exact model of the CPU, and is generally set to CPU_SUBTYPE_POWERPC_ALL or CPU_SUBTYPE_I386_ALL.

The filetype signifies how the file is to be aligned and used. It usually tells you if the file is a library, a standard executable, a core file etc. The filetype above equates to MH_EXECUTE, which signifies a demand paged executable file. Below is a snip from /usr/include/mach-o/loader.h that lists the different file-types as of this writing.
#define MH_OBJECT 0x1   /* relocatable object file */
#define MH_EXECUTE 0x2 /* demand paged executable file */
#define MH_FVMLIB 0x3 /* fixed VM shared library file */
#define MH_CORE 0x4 /* core file */
#define MH_PRELOAD 0x5 /* preloaded executable file */
#define MH_DYLIB 0x6 /* dynamically bound shared library */
#define MH_DYLINKER 0x7 /* dynamic link editor */
#define MH_BUNDLE 0x8 /* dynamically bound bundle file */
#define MH_DYLIB_STUB 0x9 /* shared library stub for static */
/* linking only, no section contents */

The next two attributes refer to the load commands section, and specify the number and size of the commands.

And finally, we have the flags, which specify various features that the kernel and the dynamic linker may use while loading and executing Mach-O files.
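The numeric filetype and flags in the otool -h output can be turned back into names with a small lookup. This Python sketch uses the file types from the loader.h snippet above and the MH_* flag bits as I know them from the same header. The function name describe_header is hypothetical.

```python
# File types, copied from the loader.h listing above.
FILETYPES = {
    0x1: "OBJECT", 0x2: "EXECUTE", 0x3: "FVMLIB", 0x4: "CORE", 0x5: "PRELOAD",
    0x6: "DYLIB", 0x7: "DYLINKER", 0x8: "BUNDLE", 0x9: "DYLIB_STUB",
}

# Header flag bits (MH_* in loader.h). Verify against your own copy of the header.
HEADER_FLAGS = [
    (0x001, "NOUNDEFS"), (0x002, "INCRLINK"), (0x004, "DYLDLINK"),
    (0x008, "BINDATLOAD"), (0x010, "PREBOUND"), (0x020, "SPLIT_SEGS"),
    (0x040, "LAZY_INIT"), (0x080, "TWOLEVEL"), (0x100, "FORCE_FLAT"),
]


def describe_header(filetype, flags):
    """Translate numeric filetype and flags into symbolic names."""
    type_name = FILETYPES.get(filetype, hex(filetype))
    flag_names = [name for bit, name in HEADER_FLAGS if flags & bit]
    return type_name, flag_names


# Values from the otool -h output for /bin/ls: filetype 2, flags 0x00000085.
print(describe_header(2, 0x85))
```

This is roughly what adding -v to the otool command does for you.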

Load Commands

The load commands region contains a list of commands that tell the kernel how to load the various raw segments within the file. They basically describe how each segment is aligned, protected and laid out in memory.

To see the list of load commands within a file, use the -l switch of the otool command.
evil:~/Temp mohit$ otool -l /bin/ls
/bin/ls:
Load command 0
cmd LC_SEGMENT
cmdsize 56
segname __PAGEZERO
vmaddr 0x00000000
vmsize 0x00001000
fileoff 0
filesize 0
maxprot 0x00000000
initprot 0x00000000
nsects 0
flags 0x4
Load command 1
cmd LC_SEGMENT
cmdsize 600
segname __TEXT
vmaddr 0x00001000
vmsize 0x00006000
fileoff 0
filesize 24576
maxprot 0x00000007
initprot 0x00000005
nsects 8
flags 0x0
Section
sectname __text
segname __TEXT
addr 0x00001ac4
size 0x000046e8
offset 2756
align 2^2 (4)
reloff 0
nreloc 0
flags 0x80000400
reserved1 0
reserved2 0

[ ___SNIPPED FOR BREVITY___ ]

Load command 4
cmd LC_LOAD_DYLINKER
cmdsize 28
name /usr/lib/dyld (offset 12)
Load command 5
cmd LC_LOAD_DYLIB
cmdsize 56
name /usr/lib/libncurses.5.4.dylib (offset 24)
time stamp 1111407638 Mon Mar 21 07:20:38 2005
current version 5.4.0
compatibility version 5.4.0
Load command 6
cmd LC_LOAD_DYLIB
cmdsize 52
name /usr/lib/libSystem.B.dylib (offset 24)
time stamp 1111407267 Mon Mar 21 07:14:27 2005
current version 88.0.0
compatibility version 1.0.0
Load command 7
cmd LC_SYMTAB
cmdsize 24
symoff 28672
nsyms 101
stroff 31020
strsize 1440
Load command 8
cmd LC_DYSYMTAB
cmdsize 80
ilocalsym 0
nlocalsym 0
iextdefsym 0
nextdefsym 18
iundefsym 18
nundefsym 83
tocoff 0
ntoc 0
modtaboff 0
nmodtab 0
extrefsymoff 0
nextrefsyms 0
indirectsymoff 30216
nindirectsyms 201
extreloff 0
nextrel 0
locreloff 0
nlocrel 0
Load command 9
cmd LC_TWOLEVEL_HINTS
cmdsize 16
offset 29884
nhints 83
Load command 10
cmd LC_UNIXTHREAD
cmdsize 176
flavor PPC_THREAD_STATE
count PPC_THREAD_STATE_COUNT
r0 0x00000000 r1 0x00000000 r2 0x00000000 r3 0x00000000 r4 0x00000000
r5 0x00000000 r6 0x00000000 r7 0x00000000 r8 0x00000000 r9 0x00000000
r10 0x00000000 r11 0x00000000 r12 0x00000000 r13 0x00000000 r14 0x00000000
r15 0x00000000 r16 0x00000000 r17 0x00000000 r18 0x00000000 r19 0x00000000
r20 0x00000000 r21 0x00000000 r22 0x00000000 r23 0x00000000 r24 0x00000000
r25 0x00000000 r26 0x00000000 r27 0x00000000 r28 0x00000000 r29 0x00000000
r30 0x00000000 r31 0x00000000 cr 0x00000000 xer 0x00000000 lr 0x00000000
ctr 0x00000000 mq 0x00000000 vrsave 0x00000000 srr0 0x00001ac4 srr1 0x00000000

The above file has 11 load commands located directly below the header, numbered 0 to 10.

The first four commands (LC_SEGMENT), numbered 0 to 3, define how segments within the file are to be mapped into memory. A segment defines a range of bytes in the Mach-O binary, and can contain zero or more sections. We will talk more about segments later.

Load command 4 (LC_LOAD_DYLINKER) specifies which dynamic linker to use. This is almost always set to /usr/lib/dyld, which is the default OS X dynamic library linker.

Commands 5 and 6 (LC_LOAD_DYLIB) specify the shared libraries that this file links against. These are loaded by the dynamic loader specified in command 4.

Commands 7 and 8 (LC_SYMTAB, LC_DYSYMTAB) specify the symbol tables used by the file and the dynamic linker respectively. Command 9 (LC_TWOLEVEL_HINTS) contains the hint table for the two-level namespace.

And finally, command 10 (LC_UNIXTHREAD), defines the initial state of the main thread of the process. This command is only included in executable files.
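Every load command starts with the same two 32-bit fields, cmd and cmdsize, which is what lets a reader step from one command to the next. Here is a minimal Python sketch of that walk, assuming the LC_* values I know from loader.h. The function name walk_load_commands is my own, and the sample region is synthetic.

```python
import struct

# Only the commands that appear in the /bin/ls listing above.
LOAD_COMMANDS = {
    0x01: "LC_SEGMENT", 0x02: "LC_SYMTAB", 0x05: "LC_UNIXTHREAD",
    0x0B: "LC_DYSYMTAB", 0x0C: "LC_LOAD_DYLIB", 0x0E: "LC_LOAD_DYLINKER",
    0x16: "LC_TWOLEVEL_HINTS",
}


def walk_load_commands(region, ncmds, order=">"):
    """Yield (index, name, cmdsize) for each command in the load commands region.

    region is the bytes that follow the header; order is the struct byte-order
    prefix (">" for the big-endian PowerPC file shown in this article).
    """
    offset = 0
    for index in range(ncmds):
        cmd, cmdsize = struct.unpack_from(order + "II", region, offset)
        if cmdsize < 8:
            raise ValueError("corrupt load command at offset %d" % offset)
        yield index, LOAD_COMMANDS.get(cmd, hex(cmd)), cmdsize
        offset += cmdsize  # cmdsize covers the whole command, including cmd and cmdsize


# Synthetic region: an LC_LOAD_DYLINKER of 28 bytes followed by an LC_SYMTAB of 24 bytes.
region = (
    struct.pack(">II", 0x0E, 28) + b"\0" * 20
    + struct.pack(">II", 0x02, 24) + b"\0" * 16
)
for entry in walk_load_commands(region, 2):
    print(entry)
```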

Segments and Sections

Most of the load commands mentioned above make references to segments within the file. A segment is a range of bytes within a Mach-O file that is mapped directly into virtual memory by the kernel and the dynamic linker. The header and load commands regions are considered part of the first segment of the file.

A typical OS X executable generally has five segments:
  • __PAGEZERO : Located at virtual memory address 0 and has no protection rights. This segment occupies no space in the file, and causes access to NULL to immediately crash.
  • __TEXT : Contains read-only data and executable code.
  • __DATA : Contains writable data. These sections are generally marked copy-on-write by the kernel.
  • __OBJC : Contains data used by the Objective-C language runtime.
  • __LINKEDIT : Contains raw data used by the dynamic linker.
The __TEXT and __DATA segments may contain zero or more sections. Each section consists of specific types of data, e.g., executable code, constants, C strings etc.
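The protection rights mentioned above are the maxprot and initprot values in the otool -l output. As a small sketch, the Python below renders them in the familiar rwx style, assuming the VM_PROT_* bit values from /usr/include/mach/vm_prot.h. The function name prot_string is my own.

```python
VM_PROT_READ = 0x1
VM_PROT_WRITE = 0x2
VM_PROT_EXECUTE = 0x4


def prot_string(prot):
    """Render a maxprot or initprot value as an rwx-style string."""
    pairs = ((VM_PROT_READ, "r"), (VM_PROT_WRITE, "w"), (VM_PROT_EXECUTE, "x"))
    return "".join(letter if prot & bit else "-" for bit, letter in pairs)


# Values taken from the load commands of /bin/ls shown earlier.
print("__PAGEZERO initprot:", prot_string(0x0))
print("__TEXT     initprot:", prot_string(0x5))
print("__TEXT     maxprot: ", prot_string(0x7))
```

Read this way, __PAGEZERO has no access rights at all, and __TEXT starts out readable and executable but not writable.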

To see the contents of a section, use the -s option with the otool command.
evil:~/Temp mohit$ otool -sv __TEXT __cstring /bin/ls
/bin/ls:
Contents of (__TEXT,__cstring) section
00006320 00000000 5f5f6479 6c645f6d 6f645f74
00006330 65726d5f 66756e63 73000000 5f5f6479
00006340 6c645f6d 616b655f 64656c61 7965645f
00006350 6d6f6475 6c655f69 6e697469 616c697a
__SNIP__
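
The hex dump above is easier to read once it is turned back into strings. This short Python sketch does that for the four lines shown. It assumes each line is an address followed by 32-bit words printed in file byte order, which is how the PowerPC output above appears. The function name strings_from_dump is my own.

```python
def strings_from_dump(lines):
    """Recover the NUL-separated C strings from otool -s style hex dump lines."""
    data = bytearray()
    for line in lines:
        _address, *words = line.split()  # first column is the address
        for word in words:
            data += bytes.fromhex(word)
    # Strings are NUL-terminated; drop the empty pieces left by padding bytes.
    return [
        chunk.decode("ascii", errors="replace")
        for chunk in bytes(data).split(b"\0")
        if chunk
    ]


dump = [
    "00006320 00000000 5f5f6479 6c645f6d 6f645f74",
    "00006330 65726d5f 66756e63 73000000 5f5f6479",
    "00006340 6c645f6d 616b655f 64656c61 7965645f",
    "00006350 6d6f6475 6c655f69 6e697469 616c697a",
]
print(strings_from_dump(dump))
# ['__dyld_mod_term_funcs', '__dyld_make_delayed_module_initializ']
```

The second string is cut short only because the dump itself was snipped.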

To disassemble the __text section, use the -tv switch.
evil:~/Temp mohit$ otool -tv /bin/ls
/bin/ls:
(__TEXT,__text) section
00001ac4 or r26,r1,r1
00001ac8 addi r1,r1,0xfffc
00001acc rlwinm r1,r1,0,0,26
00001ad0 li r0,0x0
00001ad4 stw r0,0x0(r1)
00001ad8 stwu r1,0xffc0(r1)
00001adc lwz r3,0x0(r26)
00001ae0 addi r4,r26,0x4
__SNIP__


Within the __TEXT segment, there are four major sections:
  • __text : The compiled machine code for the executable.
  • __const : General constants data.
  • __cstring : Literal string constants.
  • __picsymbol_stub : Position-independent code stub routines used by the dynamic linker.
This keeps the executable code and the non-executable data clearly separated within the segment.

Running an Application

Now that we know what a Mach-O file looks like, let us see how OS X loads and runs an application.

When you run an application, the shell first calls the fork() system call. Fork creates a logical copy of the calling process (the shell) and schedules it for execution. This child process then calls the execve() system call providing the path of the program to be executed.
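The fork-then-exec pattern is easy to try out from a script. Here is a minimal Python sketch of what a shell does when it launches a program, using the os module's wrappers around fork(), execve() and waitpid(). The function name run is my own, and a real shell does considerably more (job control, redirections, signal handling).

```python
import os


def run(path, argv):
    """Fork a child, replace it with the program at path, and return its exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: a logical copy of the caller. execve() replaces its address
        # space with the new program and only returns (by raising) on failure.
        try:
            os.execve(path, argv, os.environ)
        finally:
            os._exit(127)  # reached only if execve() failed
    # Parent: wait for the child to finish.
    _, status = os.waitpid(pid, 0)
    # WEXITSTATUS is meaningful when the child exited normally, as it does here.
    return os.WEXITSTATUS(status)


print(run("/bin/sh", ["sh", "-c", "exit 3"]))
```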

The kernel loads the specified file, and examines its header to verify that it is a valid Mach-O file. It then starts interpreting the load commands, replacing the child process's address space with segments from the file.

At the same time, the kernel also executes the dynamic linker specified by the binary, which proceeds to load and link all the dependent libraries. After it binds just enough symbols that are necessary for running the file, it calls the entry-point function.

The entry-point function is usually a standard function statically linked in from /usr/lib/crt1.o at build time. This function initializes the runtime environment and calls the executable's main() function.

The application is now running.

The Dynamic Linker

The OS X dynamic linker, /usr/lib/dyld, is responsible for loading dependent shared libraries, importing the various symbols and functions, and binding them into the current process.

When the process is first started, all the linker does is import the shared libraries into the address space of the process. Depending on how the program was built, the actual binding may be performed at different stages of its execution.
  • Immediately after loading, as in load-time binding.
  • When a symbol is referenced, as in just-in-time binding.
  • Before the process is even executed, an optimization technique known as pre-binding.
If a binding type is not specified, just-in-time binding is used.
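
The difference between load-time and just-in-time binding can be shown with a toy model. The Python sketch below is only an analogy for the idea of resolving a symbol on first use. It does not reflect dyld's actual data structures, and the class LazyBinder and its methods are hypothetical names.

```python
class LazyBinder:
    """Toy model: libraries are loaded up front, symbols are bound on demand."""

    def __init__(self, libraries):
        self.libraries = libraries  # list of dicts: symbol name -> function
        self.bound = {}             # symbols that have already been resolved
        self.lookups = 0            # how many times we had to search the libraries

    def _resolve(self, symbol):
        self.lookups += 1
        for library in self.libraries:
            if symbol in library:
                return library[symbol]
        raise LookupError("undefined symbol: " + symbol)

    def bind_all(self, symbols):
        """Load-time binding: resolve every listed symbol before running."""
        for symbol in symbols:
            self.bound[symbol] = self._resolve(symbol)

    def call(self, symbol, *args):
        """Just-in-time binding: resolve on the first call, reuse afterwards."""
        if symbol not in self.bound:
            self.bound[symbol] = self._resolve(symbol)
        return self.bound[symbol](*args)


binder = LazyBinder([{"square": lambda x: x * x}])
print(binder.lookups)            # nothing bound yet
print(binder.call("square", 3))  # first call triggers the lookup
print(binder.call("square", 4))  # second call reuses the binding
print(binder.lookups)
```

The trade-off is the usual one: lazy binding starts faster, while load-time binding reports missing symbols before any code runs.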

An application can only continue to run when all the required symbols and segments from all the different object files can be resolved. In order to find libraries and frameworks, the standard dynamic linker, /usr/lib/dyld, searches a predefined set of directories. To override these directories, or to provide fallback paths, the DYLD_LIBRARY_PATH or DYLD_FALLBACK_LIBRARY_PATH environment variables can be set to a colon-separated list of directories.
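
Here is a simplified Python sketch of that search order, assuming the behaviour described in the dyld man page: directories in DYLD_LIBRARY_PATH are tried first using the library's file name, then the install path recorded in the binary, then the directories in DYLD_FALLBACK_LIBRARY_PATH. The real linker has more rules (frameworks, default fallback directories, and so on), and the function name find_library is my own.

```python
import os


def split_path_list(value):
    """Split a colon-separated directory list, ignoring empty entries."""
    return [entry for entry in (value or "").split(":") if entry]


def find_library(install_path, env):
    """Return the first existing candidate for a library, or None.

    install_path is the name recorded in the LC_LOAD_DYLIB command,
    e.g. /usr/lib/libncurses.5.4.dylib. env is a dict of environment variables.
    """
    leaf = os.path.basename(install_path)
    candidates = [
        os.path.join(directory, leaf)
        for directory in split_path_list(env.get("DYLD_LIBRARY_PATH"))
    ]
    candidates.append(install_path)
    candidates += [
        os.path.join(directory, leaf)
        for directory in split_path_list(env.get("DYLD_FALLBACK_LIBRARY_PATH"))
    ]
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


print(find_library("/usr/lib/libncurses.5.4.dylib", dict(os.environ)))
```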

Finally

As you can see, executing a process in OS X is a complex affair, and I have tried to cover as much as is necessary for a useful debugging session.

To learn more about Mach-O executables, otool, and the OS X kernel in general, here is a list of references that I would recommend:

Mac OS X ABI Mach-O File Format Reference
Executing Mach-O Files
Overview of Dynamic Libraries
The otool man page
The dyld man page
/usr/include/mach/machine.h
/usr/include/mach-o/loader.h

Updates

2006/03/28 - Looks like this article was Slashdotted and Dugg. It has been slightly modified since, thanks to a few readers who pointed out errors and typos within.

2006/03/28 - I have answered some of your questions and comments regarding this article here: Q&A: How OS X Executes Applications.

Wednesday, March 22, 2006

Using Spotlight from the OS X Commandline

One significant productivity-enhancing feature that arrived with Tiger was Spotlight. On its own, it changed the way some (if not most) Mac users use their desktops. A simple command-space pops up the Spotlight window, where you can enter a query string, and in a matter of seconds, get a list of files matching your query.

Spotlight has many advantages over traditional file-searching tools. For one thing, it's not a tool. It is a complete indexing and search framework that is tightly integrated into the Operating System. In addition to filenames and paths, it also indexes by file metadata and content. So Spotlight returns query results based on what's inside the file.

Spotlight benefits can also be enjoyed on the commandline, and this article explains how you can take full advantage of it from inside the OS X Terminal window.

The Old Way

Most commandline users are familiar with the ubiquitous find command.
evil:~/Desktop mohit$ find / -name '*Rails*'
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/Syntaxes/HTML (Rails).plist
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/Syntaxes/Ruby on Rails.plist
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/Syntaxes/SQL (Rails).plist

Find is old-school. You give it a search path, and it begins its search by tediously recursing directories and finding matches to the query string. On even an average-sized filesystem, find can take a frustratingly long time.
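
To see why this is slow, here is roughly what find -name does, sketched in Python with os.walk. It is an approximation for illustration only, and the function name find is simply borrowed from the command.

```python
import fnmatch
import os


def find(root, pattern):
    """Yield every path under root whose last component matches the glob pattern."""
    # os.walk visits every directory below root, one by one, which is
    # exactly the tedious recursion that makes find slow on large filesystems.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if fnmatch.fnmatchcase(name, pattern):
                yield os.path.join(dirpath, name)


for path in find(os.getcwd(), "*Rails*"):
    print(path)
```

Nothing is remembered between runs, so every search pays the full cost of the traversal again.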

Then there's the locate tool. Locate is much faster because it maintains a periodically-updated index of filenames and their locations.
evil:~/Desktop mohit$ locate Rails
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/Commands
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/Commands/Open test-case.plist
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/info.plist
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/Snippets
/Applications/TextMate.app/Contents/SharedSupport/Bundles/Rails.tmbundle/Snippets/170 eruby forin.plist

The problem with locate is that the index is not updated dynamically. On OS X systems, it is updated weekly by code residing in /etc/weekly. Also, locate does not index the contents of the files, nor does it know anything about file metadata.

Enter Spotlight

Spotlight consists of a metadata-store and a content index that is dynamically updated by various importer plugins within the system.

The metadata that Spotlight maintains can be very application-specific. For example, images can contain metadata such as, "Dimensions" and "Color Space". Or music files can contain metadata such as, "Genre", "Bit-Rate" or "Encoding".

Spotlight indexes data by way of various importer plugins. These plugins know how to handle various kinds of data, such as iChat Transcripts, iTunes Music, e-mail etc.

Below is a snippet of the top processes on my PowerBook.
 1352 top         10.4%  0:01.41   1    18    22   620K   416K  1.05M  27.0M 
1337 mdimport 0.0% 0:00.45 4 62 55 1.06M 3.98M 3.15M 39.9M
1294 mdimport 0.0% 0:00.21 3 61 46 776K 2.82M 2.28M 38.9M
1283 mdimport 0.0% 0:00.35 3 61 47 748K 3.19M 2.36M 39.4M
1281 lookupd 0.0% 0:00.17 2 34 39 440K 912K 1.20M 28.5M
1258 iTunes 2.4% 2:32.63 4 226 376 17.0M 26.8M 41.7M 227M

Notice the three mdimport processes. The mdimport daemon is responsible for working with the importer plugins and updating the Spotlight index.

Example 1. A Basic Spotlight Query

The commandline version of Spotlight is mdfind. Simply provide your search query as a parameter and let it run.
evil:~/Desktop mohit$ mdfind Rails
/Users/mohit/Documents/Rails4Days.pdf
/Users/mohit/Documents/Agile Development with Rails.pdf
/Users/mohit/Library/Mail/POP-foobar@mail.snip.com/INBOX.mbox/Messages/20455.emlx
/Users/mohit/Local/rails
/opt/local/lib/ruby/gems/1.8/cache/rails-1.0.0.gem
/opt/local/lib/ruby/gems/1.8/gems/rails-1.0.0
/opt/local/lib/ruby/gems/1.8/gems/rails-1.0.0/bin/rails
/opt/local/lib/ruby/gems/1.8/gems/rails-1.0.0/builtin/controllers/rails_info_controller.rb
/opt/local/lib/ruby/gems/1.8/gems/rails-1.0.0/html/index.html
/opt/local/lib/ruby/gems/1.8/gems/rails-1.0.0/html/images/rails.png

The files that are listed also include files with content and metadata that matches the query expression.

Example 2. Limiting Your Search to a Specific Directory

The -onlyin parameter limits the scope of the search to the directory specified.
evil:~/Desktop mohit$ mdfind -onlyin ~/Desktop Rails
/Users/mohit/Desktop/Downloads/Linux/Documents/Work/Verizon Data/Tekelec/Tekelec_Alarm_Docs.pdf
/Users/mohit/Desktop/Projects/Client/nABLE Event Manager - High Level Architecture.doc
/Users/mohit/Desktop/Projects/Client/nABLE EM.doc
/Users/mohit/Desktop/Projects/Client/to-timesheet-2006-01.pdf


Example 3. Displaying File Metadata

Earlier, I mentioned that Spotlight also indexes file metadata. The mdls tool lets you examine the metadata for a specified file.
evil:~/Desktop/Projects/Tierone mohit$ mdls SomeDocument.doc 
SomeDocument.doc -------------
kMDItemAttributeChangeDate = 2006-01-23 08:12:42 -0500
kMDItemAuthors = ("Homer Simpson")
kMDItemContentCreationDate = 2006-01-23 08:12:40 -0500
kMDItemContentModificationDate = 2006-01-23 08:12:40 -0500
kMDItemContentType = "com.microsoft.word.doc"
kMDItemContentTypeTree = ("com.microsoft.word.doc", "public.data", "public.item")
kMDItemDisplayName = "SomeDocument.doc"
kMDItemFSContentChangeDate = 2006-01-23 08:12:40 -0500
kMDItemFSCreationDate = 2006-01-23 08:12:40 -0500
kMDItemFSCreatorCode = 1297307460
kMDItemFSFinderFlags = 0
kMDItemFSInvisible = 0
kMDItemFSIsExtensionHidden = 0
kMDItemFSLabel = 0
kMDItemFSName = "SomeDocument.doc"
kMDItemFSNodeCount = 0
kMDItemFSOwnerGroupID = 20
kMDItemFSOwnerUserID = 501
kMDItemFSSize = 92160
kMDItemFSTypeCode = 1463304782
kMDItemID = 2821259
kMDItemKind = "Microsoft Word document"
kMDItemLastUsedDate = 2006-01-23 08:12:40 -0500
kMDItemTitle = "Document:"
kMDItemUsedDates = (2006-01-23 08:12:40 -0500)

The metadata consists of various attributes specific to the file. These attributes can be used with mdfind to limit the scope of your search.

A good reference for these metadata attributes can be found at the Apple Developer Connection site.

Example 4. Finding Files by a Specific Author

This time, we limit our search to all files by a given author. The attribute we use is kMDItemAuthors.
evil:~ mohit$ mdfind "kMDItemAuthors == '*Homer*'"
/Users/mohit/Documents/SomeDocument.doc
/Users/mohit/Documents/Microsoft User Data/AutoRecovery save of SomeDocument.doc

Notice that the query was double quoted, while the text-pattern was single quoted.
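
The two levels of quoting matter when you call mdfind from a script. The double quotes are for the shell, while the single quotes are part of the query itself. Here is a small Python sketch, assuming only the mdfind usage shown in this article (a query argument and the -onlyin option). The helper names attribute_query and mdfind are my own.

```python
import shutil
import subprocess


def attribute_query(attribute, pattern):
    """Build a query such as: kMDItemAuthors == '*Homer*'"""
    if "'" in pattern:
        # Escaping rules are not covered here, so refuse rather than guess.
        raise ValueError("pattern must not contain a single quote")
    return "%s == '%s'" % (attribute, pattern)


def mdfind(query, only_in=None):
    """Run mdfind and return the matching paths as a list of strings."""
    if shutil.which("mdfind") is None:
        raise RuntimeError("mdfind not found; it is only available on OS X")
    command = ["mdfind"]
    if only_in:
        command += ["-onlyin", only_in]
    # Passing a list means no shell is involved, so the query needs no outer quotes.
    command.append(query)
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


print(attribute_query("kMDItemAuthors", "*Homer*"))
```

On a Mac, mdfind(attribute_query("kMDItemAuthors", "*Homer*")) should return the same list of paths as the command above.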

Example 5. Finding Music by Artist

Spotlight query expressions can be quite sophisticated. They allow for various kinds of conditional operators and patterns. Below, we search for all music by John Scofield.
evil:~ mohit$ mdfind "kMDItemAuthors == 'John Scofield' && kMDItemContentType == 'public.mp3'"
/Users/mohit/Music/iTunes/iTunes Music/John Scofield/A Go Go/07 Green Tea.mp3
/Users/mohit/Music/iTunes/iTunes Music/John Scofield/A Go Go/06 Kubrick.mp3

Great! But I seem to be missing some files. Where are my AACs?

Spotlight organizes ContentTypes within ContentTypeTrees, so in this case, public.mp3 falls under public.audio.

Knowing this, let's refine our search query to include all audio files.

evil:~ mohit$ mdfind "kMDItemAuthors == 'John Scofield' && kMDItemContentTypeTree == 'public.audio'"
/Users/mohit/Music/iTunes/iTunes Music/John Scofield/A Go Go/07 Green Tea.mp3
/Users/mohit/Music/iTunes/iTunes Music/John Scofield/A Go Go/06 Kubrick.mp3
/Users/mohit/Music/iTunes/iTunes Music/John Scofield/That's What I Say_ John Scofield Plays The Music of Ray Charles/01 Busted.m4a
/Users/mohit/Music/iTunes/iTunes Music/John Scofield/That's What I Say_ John Scofield Plays The Music of Ray Charles/02 What'd I Say.m4a

Much better. But how did I know what to search for?

This is where mdls comes in handy again.

evil:~ mohit$ mdls "/Users/mohit/Music/iTunes/iTunes Music/John Scofield/A Go Go/02 Chank.mp3"
/Users/mohit/Music/iTunes/iTunes Music/John Scofield/A Go Go/02 Chank.mp3 -------------
kMDItemAlbum = "A Go Go"
kMDItemAttributeChangeDate = 2005-11-26 22:00:00 -0500
kMDItemAudioBitRate = 128
kMDItemAudioChannelCount = 2
kMDItemAudioSampleRate = 44100
kMDItemAuthors = ("John Scofield")
kMDItemComment = "Created by Grip"
kMDItemContentCreationDate = 2003-10-28 20:34:30 -0500
kMDItemContentModificationDate = 2003-10-28 20:34:35 -0500
kMDItemContentType = "public.mp3"
kMDItemContentTypeTree = (
"public.mp3",
"public.audio",
"public.audiovisual-content",
"public.data",
"public.item",
"public.content"
)
kMDItemDisplayName = "02 Chank.mp3"
kMDItemDurationSeconds = 406
kMDItemFSContentChangeDate = 2003-10-28 20:34:35 -0500
kMDItemFSCreationDate = 2003-10-28 20:34:30 -0500
kMDItemFSCreatorCode = 0
kMDItemFSFinderFlags = 0
kMDItemFSInvisible = 0
kMDItemFSIsExtensionHidden = 0
kMDItemFSLabel = 0
kMDItemFSName = "02 Chank.mp3"
kMDItemFSNodeCount = 0
kMDItemFSOwnerGroupID = 20
kMDItemFSOwnerUserID = 501
kMDItemFSSize = 6511095
kMDItemFSTypeCode = 0
kMDItemID = 208222
kMDItemKind = "MP3 Audio File"
kMDItemLastUsedDate = 2003-10-28 20:34:35 -0500
kMDItemMediaTypes = (Sound)
kMDItemMusicalGenre = "Jazz"
kMDItemRecordingYear = 1998
kMDItemTitle = "Chank"
kMDItemTotalBitRate = 128
kMDItemUsedDates = (2003-10-28 20:34:35 -0500)

Looking at kMDItemContentTypeTree, we can tell that public.mp3 falls under public.audio.
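
To illustrate why the tree query finds more files than the exact-type query, here is a toy Python model of the two comparisons. The first record copies the attributes from the mdls output above. The second record, including its content type name, is an assumption I made up for the example, and both function names are hypothetical.

```python
def matches_exact_type(item, wanted):
    """Like kMDItemContentType == 'wanted': only the most specific type counts."""
    return item["kMDItemContentType"] == wanted


def matches_type_tree(item, wanted):
    """Like kMDItemContentTypeTree == 'wanted': any ancestor type counts."""
    return wanted in item["kMDItemContentTypeTree"]


mp3 = {
    "kMDItemContentType": "public.mp3",
    "kMDItemContentTypeTree": [
        "public.mp3", "public.audio", "public.audiovisual-content",
        "public.data", "public.item", "public.content",
    ],
}
# Hypothetical record for an AAC file; the exact type name is an assumption.
aac = {
    "kMDItemContentType": "public.mpeg-4-audio",
    "kMDItemContentTypeTree": [
        "public.mpeg-4-audio", "public.audio", "public.audiovisual-content",
        "public.data", "public.item", "public.content",
    ],
}

for item in (mp3, aac):
    print(item["kMDItemContentType"],
          matches_exact_type(item, "public.mp3"),
          matches_type_tree(item, "public.audio"))
```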

We could have also searched by kMDItemMediaTypes, or kMDItemKind, or even a '*mp3' pattern in kMDItemDisplayName.

Example 6. Finding Other Content

You can find images by querying for files with kMDItemContentTypeTree set to public.image.
$ mdfind "kMDItemContentTypeTree == 'public.image'"

How about we refine that to only images within our iPhoto library.
$ mdfind -onlyin ~/Pictures "kMDItemContentTypeTree == 'public.image'"

Much better.

Looking for Word documents?
$ mdfind "kMDItemContentType == 'com.microsoft.word.doc'"

Or maybe just PDFs?
$ mdfind "kMDItemContentType == 'com.adobe.pdf'"

Or Both?
$ mdfind "kMDItemContentType == 'com.microsoft.word.doc' || kMDItemContentType == 'com.adobe.pdf'"

Let's stick to plain-text.
$ mdfind "kMDItemContentTypeTree == 'public.text'"


Example 7. Looking for Source Code

Finding all Ruby scripts.
$ mdfind "kMDItemContentType == 'public.ruby-script'"

Finding all kinds of scripts (Python, Bash, Ruby etc.)
$ mdfind "kMDItemContentTypeTree == 'public.shell-script'"

Finding all scripts except Python scripts.
$ mdfind "kMDItemContentTypeTree == 'public.shell-script' && kMDItemContentType != 'public.python-script'"
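
Compound queries like these are tedious to type and easy to get wrong, so it can help to assemble them from small pieces. Here is a minimal Python sketch that only builds the query strings, using the ==, !=, && and || operators seen in the examples above. The helper names eq, ne, all_of and any_of are my own.

```python
def eq(attribute, value):
    """attribute == 'value'"""
    return "%s == '%s'" % (attribute, value)


def ne(attribute, value):
    """attribute != 'value'"""
    return "%s != '%s'" % (attribute, value)


def all_of(*clauses):
    """Join clauses so that every one of them must match."""
    return " && ".join(clauses)


def any_of(*clauses):
    """Join clauses so that at least one of them must match."""
    return " || ".join(clauses)


# Word documents or PDFs.
print(any_of(eq("kMDItemContentType", "com.microsoft.word.doc"),
             eq("kMDItemContentType", "com.adobe.pdf")))
# All scripts except Python scripts.
print(all_of(eq("kMDItemContentTypeTree", "public.shell-script"),
             ne("kMDItemContentType", "public.python-script")))
```

The sketch does not add parentheses, so it is only safe when a query does not mix && and || in the same expression.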

Finding Source Code (not scripts).
$ mdfind "kMDItemContentTypeTree == 'public.source-code'"


Example 8. Using "kind" Keywords (Added 24/Mar/06)

Commandline Spotlight also supports the "kind:" keyword. This is simpler than filtering with kMDItemContentType.
evil:/ mohit$ mdfind "kind:pdf Calculus"
/Users/mohit/Documents/Elementary Calculus.pdf
/Users/mohit/.Trash/marktoberdorf.pdf
/Users/mohit/.Trash/FoundInfsmlCalc.pdf

Spotlight "kind" Keyword list.
Applications  kind:application, kind:applications, kind:app
Contacts kind:contact, kind:contacts
Folders kind:folder, kind:folders
Email kind:email, kind:emails, kind:mail message, kind:mail messages
iCal Events kind:event, kind:events
iCal To Dos kind:todo, kind:todos, kind:to do, kind:to dos
Images kind:image, kind:images
Movies kind:movie, kind:movies
Music kind:music
Audio kind:audio
PDF kind:pdf, kind:pdfs
Preferences kind:system preferences, kind:preferences
Bookmarks kind:bookmark, kind:bookmarks
Fonts kind:font, kind:fonts
Presentations kind:presentations, kind:presentation


Example 9. Using "date" Keywords (Added 24/Mar/06)

Files can also be filtered based on date related information.
evil:/ mohit$ mdfind "kind:pdf date:this week"
/Users/mohit/Desktop/chapter_1a.pdf
/Users/mohit/Documents/Elementary Calculus.pdf
/Users/mohit/Desktop/13.pdf
/Users/mohit/Desktop/Internet_map_labels.pdf

The date ranges that can be specified are:
  • date:this month
  • date:this week
  • date:this year
  • date:today
  • date:yesterday
  • date:tomorrow
  • date:next month
  • date:next week
  • date:next year

Note that the future ranges (tomorrow, next week, etc.) are for Calendar appointments.

Finally

As you can see, Spotlight is great for commandline junkies too. It is a fast, flexible alternative to the UNIX find command, and in many respects, more powerful than find.

But it is by no means a replacement. There are some things that Spotlight's mdfind just cannot do. UNIX find has a much richer set of options, and when it comes to digging deep into the system, there is no alternative.

For most purposes though, Spotlight works very well. Rewriting your shell scripts to use mdfind instead of find will make them far more responsive (and far less portable). So here's another case where OS X's UNIX underpinnings have made for a useful tool that is usable both from the GUI and from the commandline.

Sunday, March 12, 2006

Troubleshooting UNIX Systems with Lsof

One of the least-talked-about tools in a UNIX sysadmin's toolkit is lsof. Lsof lists information about files opened by processes. But that's really an understatement.

Most people forget that, in UNIX, (almost) everything is a file. The OS makes hardware available to applications by way of files in /dev. Kernel, system, memory, device etc. information is made available inside files in /proc. TCP/UDP sockets are sometimes represented internally as files. Even directories are really just files containing other filenames.

Lsof works by examining kernel data-structures and provides a variety of information related to files, pipes, sockets and more.

Lsof is installed by default on most Linux distributions, BSD distributions and OS X. Binary packages for Solaris, AIX, HP-UX, *cough*SCO OpenServer*cough* and many other UNIXes (Unices?) are available on the web.

So, just how useful is lsof?

Deciphering its Output

Switch to root, and type lsof on the commandline.
linux# lsof
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
init 1 root cwd DIR 3,65 4096 2 /
init 1 root rtd DIR 3,65 4096 2 /
init 1 root txt REG 3,65 29556 172317 /sbin/init
init 1 root mem REG 3,65 1166880 93908 /lib/libc-2.3.5.so
init 1 root mem REG 3,65 103053 93909 /lib/ld-2.3.5.so
init 1 root 10u FIFO 3,65 48438 /dev/initctl
ksoftirqd 2 root cwd DIR 3,65 4096 2 /
ksoftirqd 2 root rtd DIR 3,65 4096 2 /
ksoftirqd 2 root txt unknown /proc/2/exe
events/0 3 root cwd DIR 3,65 4096 2 /
events/0 3 root rtd DIR 3,65 4096 2 /
events/0 3 root txt unknown /proc/3/exe

...SNIP...

syslog-ng 6529 root txt REG 3,69 114132 84690 /usr/sbin/syslog-ng
syslog-ng 6529 root mem REG 3,65 1166880 93908 /lib/libc-2.3.5.so
syslog-ng 6529 root mem REG 3,65 64568 93943 /lib/libresolv-2.3.5.so
syslog-ng 6529 root mem REG 3,65 75176 93924 /lib/libnsl-2.3.5.so
syslog-ng 6529 root mem REG 3,65 103053 93909 /lib/ld-2.3.5.so
syslog-ng 6529 root 0u CHR 1,3 47320 /dev/null
syslog-ng 6529 root 1u CHR 1,3 47320 /dev/null
syslog-ng 6529 root 2u CHR 1,3 47320 /dev/null
syslog-ng 6529 root 3u unix 0xdea00e00 672127 /dev/log

...SNIP...

asterisk 7001 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7001 root 11r FIFO 3,70 306 /var/run/asterisk/autodial.ctl
asterisk 7001 root 12u IPv4 6834 UDP *:5060
asterisk 7001 root 13r FIFO 0,5 6019 pipe
asterisk 7001 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7001 root 15u IPv4 6835 UDP *:2727
asterisk 7001 root 16u IPv4 6861 UDP *:4569
asterisk 7001 root 17u REG 3,70 0 593222 /var/lib/asterisk/astdb
asterisk 7001 root 18r FIFO 0,5 6883 pipe
asterisk 7001 root 19u REG 3,70 39402 32066 /var/tmp/iaxy.bin-1909889093 (deleted)
asterisk 7001 root 20w FIFO 0,5 6883 pipe

...LOTS MORE SNIPPED...

What you will be presented with is a very long list of open files, which you might want to pipe through your favourite pager.

By default (on Linux), lsof displays the following information about each open file:

  • COMMAND: The name of the UNIX command associated with the process.

  • PID: The Process ID.

  • USER: The user ID or login name of the user to whom the process belongs.

  • FD: The file descriptor number of the file or a code representing more information about the structure. See manual page for details.

  • TYPE: The type of the node associated with the file. E.g. REG signifies a regular file, IPv4 or IPv6 signifies an IP socket, DIR a directory, "unix" a UNIX domain socket, etc.

  • DEVICE: Usually contains major and minor device numbers for the files, or addresses/references for other structures.

  • SIZE: The size of the file or the file offset, in bytes (if available). For files that don't have true sizes (e.g., sockets, pipes), lsof displays the amount of content in their kernel buffers.

  • NODE: Node number / inode / Internet protocol type (TCP) etc.

  • NAME: The name of the file / mount point / device / Internet address / etc.

For a comprehensive description of these fields, refer to the lsof manual page.

Since lsof works by examining kernel memory, you will need root access to be able to fully utilize it. A non-root user will not have access to information that belongs to other users.
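The columns above are meant for humans, and some of them can be blank, which makes them awkward to split in a script. For scripting, lsof has a -F option that prints one field per line, each prefixed with a single identifying character (p for PID, c for command, f for file descriptor, n for name). Below is a rough Python 3 sketch of a parser for "lsof -F pcfn" output. The function name and the sample text are mine; the sample is hand-written in the -F format, not captured from a real system.

```python
def parse_lsof_fields(text):
    """Parse `lsof -F pcfn` output into one dict per open file."""
    records = []
    proc = {}        # fields of the process set we are currently in
    current = None   # the file record currently being filled in
    for raw in text.splitlines():
        if not raw:
            continue
        key, value = raw[0], raw[1:]
        if key == "p":            # a new process set starts with its PID
            proc = {"pid": int(value)}
            current = None
        elif key == "c":          # command name of that process
            proc["command"] = value
        elif key == "f":          # a new file set starts with the descriptor
            current = {"pid": proc.get("pid"),
                       "command": proc.get("command"),
                       "fd": value}
            records.append(current)
        elif key == "n" and current is not None:
            current["name"] = value
    return records

# Hand-written sample in the -F format (an assumption for illustration).
SAMPLE = "\n".join([
    "p1", "cinit", "fcwd", "n/", "ftxt", "n/sbin/init",
    "p6529", "csyslog-ng", "f0", "n/dev/null",
])
```

Feeding SAMPLE to parse_lsof_fields gives three records, one per open file, each carrying the PID and command of the process that owns it.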

Common Usage

Lsof is usually run with one or more of the following options:

  • /path/to/file: List processes, owners and open file descriptors that are currently using the specified file.

  • -i [46][protocol][@hostname|hostaddr][:service|port]: List Internet files / sockets.

  • -u name: List files owned by user.

  • -p pid: List files open by specified process.

  • -t: Terse output. No headers, only PIDs. Useful within scripts.

  • -n: Disable resolving of network names.

  • -N: List NFS files.

These options are ORed by default.

Display all internet files OR files opened by user "foobar".
# lsof -u foobar -i

To display all internet files that are opened by foobar, you need to apply the AND (-a) condition between the switches.
# lsof -u foobar -a -i
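If the OR/AND distinction is still fuzzy, here is a toy model of it in Python 3. This is not how lsof is implemented; the records and function names are made up purely to show what "any selector matches" versus "all selectors match" does to the result.

```python
# Made-up open-file records, standing in for lsof's listing.
open_files = [
    {"user": "foobar", "internet": True,  "name": "TCP *:8080"},
    {"user": "foobar", "internet": False, "name": "/home/foobar/notes.txt"},
    {"user": "root",   "internet": True,  "name": "TCP *:ssh"},
    {"user": "root",   "internet": False, "name": "/var/log/messages"},
]

def owned_by_foobar(f):
    """Stands in for -u foobar."""
    return f["user"] == "foobar"

def is_internet(f):
    """Stands in for -i."""
    return f["internet"]

def select(files, selectors, and_mode=False):
    """Keep files matching any selector (default) or all of them (-a)."""
    combine = all if and_mode else any
    return [f["name"] for f in files if combine(s(f) for s in selectors)]
```

Without and_mode, select returns everything foobar has open plus every Internet socket. With and_mode=True, only foobar's Internet socket is left.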


The following recipes demonstrate how lsof can be used to troubleshoot real-world problems.

Recipe #1: Finding Port Hogs

Your web-server is refusing to come up because port 80 is in use by another process. How do you track down the offending process?

# lsof -i

... SNIP ...

asterisk 7554 root 16u IPv4 6861 UDP *:4569
postmaste 7688 postgres 5u IPv4 5955 UDP localhost:32768->localhost:32768
postmaste 7689 postgres 5u IPv4 5955 UDP localhost:32768->localhost:32768
sshd 27038 root 3u IPv4 677971 TCP reddwarf:ssh->CPE.xxxx.com:61702 (ESTABLISHED)
sshd 27043 mohit 3u IPv4 677971 TCP reddwarf:ssh->CPE.xxxx.com:61702 (ESTABLISHED)

... SNIP ...

Nice. A list of open Internet sockets, along with the processes, addresses and owners. Also note that (similar to netstat), the TCP states are displayed. Above, we can see two established ssh sessions in progress.

Let's add a port filter and find exactly what we're looking for.

# lsof -i TCP:80
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
lighttpd 7356 lighttpd 3u IPv4 6409 TCP *:http (LISTEN)

Okay, so lighttpd is the reason why Apache won't run. That's probably a good thing.

Recipe #2: Finding Processes Within a Given Port Range

You need to find a range of free ports for your new multimedia application.

# lsof -i TCP:5000-5200
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
asterisk 7001 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7001 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7002 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7002 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7039 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7039 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7040 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7040 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7041 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7041 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7042 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7042 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7044 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7044 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
perl 7046 root 3u IPv4 6054 TCP *:5100 (LISTEN)
perl 7046 root 4u IPv4 6055 TCP *:5101 (LISTEN)
perl 7046 root 6u IPv4 6056 TCP localhost:32768->localhost:5038 (ESTABLISHED)
asterisk 7073 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7073 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
asterisk 7504 root 10u IPv4 6015 TCP localhost:5038 (LISTEN)
asterisk 7504 root 14u IPv4 6016 TCP localhost:5038->localhost:32768 (ESTABLISHED)
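The listing tells you what is taken; working out what is free is left to you. A different way to answer the same question from a script is to simply try binding each port. The Python 3 sketch below does that with the standard socket module. It is a swapped-in technique, not something lsof does, the function names are mine, and the answer is only a snapshot: a port that is free now can be grabbed by someone else a moment later.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Covers "address already in use" and, for non-root users,
            # "permission denied" on privileged ports.
            return False
        return True

def free_ports(first, last, host="127.0.0.1"):
    """List the ports in [first, last] that could be bound right now."""
    return [p for p in range(first, last + 1) if port_is_free(p, host)]
```

For the example above, free_ports( 5000, 5200 ) would be the call to make on the machine in question.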

Recipe #3: Listing User Files

What files do users "foobar" and "apache" have open?
# lsof -u foobar,apache

List UDP ports in use by user "mohit".
# lsof -i UDP -a -u mohit

Who's responding to "who"?
# lsof -i UDP:who


Recipe #4: Unmounting a Disk or Filesystem

Sometimes you need to track down the user or process that's blocking you from unmounting a disk.
# umount /opt
umount: /opt: device is busy
umount: /opt: device is busy
# mount | grep "/opt"
/dev/hdb9 on /opt type ext3 (rw,noatime)
# lsof /dev/hdb9
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
perl 7046 root 2w REG 3,73 111 1376386 /opt/local/paynacea/var/state/callmanager.pid.err
perl 7046 root 5w REG 3,73 6783 1376385 /opt/local/paynacea/var/log/callmanager.log
# kill 7046
# umount /opt

Or the simpler:
# kill `lsof -t /opt`
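The one-liner is handy, but it kills first and asks questions later. If you'd rather look at the PIDs before doing anything drastic, the same -t output is trivial to consume from Python 3. A small sketch, with function names of my own choosing, assuming lsof is on the PATH:

```python
import subprocess

def parse_terse_pids(output):
    """Turn the output of `lsof -t` (one PID per line) into a list of ints."""
    return [int(token) for token in output.split()]

def pids_using(path):
    """Return the PIDs of processes that have files open on path."""
    # lsof exits non-zero when nothing matches, so don't treat that as fatal.
    result = subprocess.run(["lsof", "-t", path],
                            capture_output=True, text=True)
    return parse_terse_pids(result.stdout)
```

pids_using( "/opt" ) gives you a list you can inspect, log, or pass to os.kill once you're sure.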


Recipe #5: Finding Device Hogs

Who's using the audio manager?
# lsof /dev/audio

Why can't I start my alternate logger?
# lsof /dev/log

Why doesn't my CD eject?
# lsof /dev/cdrom


Recipe #6: Using Exclusions

The '^' (negated) modifier can prefix the User or Process ID parameters to exclude them from the resulting list. Since they represent exclusions, they are applied without ORing or ANDing and take effect before any other selection criteria are applied.

List all Internet files/sockets open by non-root users.
# lsof -i -u^root


Recipe #7: Recursing Directories

The '+D' option causes lsof to search for open files within a specified directory, recursing down to its complete depth.

List all processes that have files open in /tmp.
# lsof +D /tmp

The '+d' option does the same thing, but does _not_ descend the directory tree.

Recipe #8: Matching by Process Name

List all files open by processes beginning with the letters mpg.
# lsof -c mpg


Using a regular-expression.
# lsof -c '/post.*er/'


Recipe #9: Examining Suspicious Processes

Lsof can be used along with strace to examine and monitor the operation of viruses, worms or spyware.

What files are opened by PID 14554?
# lsof -p 14554


Who's looking at the password file?
# lsof /etc/passwd


Recipe #10: Repeat Mode

The -r switch puts lsof in repeat mode. It displays a listing, waits 15 seconds (or the number of seconds you specify), and then displays another.

Watching a user's open files every 5 seconds:
# lsof -u badcop -r5

Monitoring the password file:
# lsof -r 2 /etc/passwd


Recipe #11: Finding Deleted Open Files

This recipe was added on 26/Mar/06 after an anonymous poster left a comment regarding deleted files.


One of the most annoying problems is a file-system quickly running out of space, without a hint of what file is responsible for it. This happens when a file (usually a log-file) gets deleted while it's still being written to. When you delete an open file, the kernel unlinks the file from the directory, but cannot remove the inode, since it's still open.

This causes the file to continue to grow, with no trace of its existence anywhere. Well... almost anywhere.

Lsof provides the +L parameter to list the link count of each open file. When followed by a number, lsof only displays files with link counts less than the specified number.
mohit@reddwarf ~ $ lsof +L3
COMMAND PID USER FD TYPE DEVICE SIZE NLINK NODE NAME
sshd 11540 mohit mem REG 3,69 303448 1 85869 /usr/sbin/sshd
sshd 11540 mohit mem REG 3,65 35404 1 94075 /lib/libnss_nis-2.3.5.so
sshd 11540 mohit mem REG 3,65 30928 1 94086 /lib/libnss_compat-2.3.5.so
sshd 11540 mohit mem REG 3,65 35236 1 93958 /lib/libnss_files-2.3.5.so
sshd 11540 mohit mem REG 3,65 28444 1 94094 /lib/libcrack.so.2.8.0

A deleted file has zero links. So the following command displays deleted-but-open files on a system.
$ lsof +L1
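You can watch the zero-link state happen from inside a process. The Python 3 sketch below (the function name is mine) creates a temporary file, deletes it while the descriptor is still open, and keeps writing to it. On the local Linux and OS X filesystems I'd expect the link count to drop from 1 to 0 while the size keeps growing; network filesystems such as NFS may behave differently.

```python
import os
import tempfile

def unlinked_file_demo():
    """Delete a file that is still open and report what the kernel says."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"first line\n")
        links_before = os.fstat(fd).st_nlink

        os.unlink(path)                      # the name is gone...
        os.write(fd, b"written after deletion\n")   # ...the inode is not

        info = os.fstat(fd)
        return links_before, info.st_nlink, info.st_size, os.path.exists(path)
    finally:
        os.close(fd)                         # only now is the space released
```

The tuple it returns is the link count before the unlink, the link count after, the size in bytes after the second write, and whether the path still exists. While the function is between the unlink and the close, the file is exactly the kind that lsof +L1 reports.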

Display a list of deleted-but-open files within a specific filesystem.
$ lsof +aL1 /tmp


Finally

We barely scratched the surface with the above recipes, but as you can see, lsof is a powerful troubleshooting tool. I'd be interested in learning what other users do with lsof. Toy with it, tinker with it, use it and let me know how it has helped you.

Thursday, March 09, 2006

Tailing Multiple Log Files with Twisted

In my last entry, I described a mechanism to follow a growing log file using the Twisted framework. Now, what if you wanted to follow multiple log files simultaneously?

You could try something like this:
#!/usr/bin/env python

from twisted.internet import reactor
from Support.followtail import FollowTail

def lineReceived_file1( line ):
    print( "FILE1: " + line )

def lineReceived_file2( line ):
    print( "FILE2: " + line )

if __name__ == "__main__":
    tailer1 = FollowTail( "/home/mohit/var/log/access-log" )
    tailer2 = FollowTail( "/home/mohit/var/log/error-log" )

    # One callback per file.
    tailer1.lineReceived = lineReceived_file1
    tailer2.lineReceived = lineReceived_file2

    tailer1.start()
    tailer2.start()

    reactor.run()

Sure, that works fine. But there's a scaling issue above. What if you wanted to follow an unknown number of files, and track which file each line came from? Since a function callback is required for each file, it's quite difficult to use FollowTail for more "dynamic" applications.

Here's a simple solution: Subclass FollowTail, and override lineReceived to have it supply the filename along with the line.
#!/usr/bin/env python

from twisted.internet import reactor
from Support.followtail import FollowTail

import sys

class MultiFollowTail( FollowTail ):
    def lineReceived( self, line ):
        # Pass the filename along with every line.
        self.onFileLine( self.filename, line )

    def onFileLine( self, filename, line ):
        """Override This"""
        pass

def lineReceived( filename, line ):
    print( filename + ": " + line )

if __name__ == "__main__":
    for arg in sys.argv[1:]:
        print( "Monitoring: " + arg )
        tailer = MultiFollowTail( arg )
        tailer.onFileLine = lineReceived
        tailer.start()

    # Only start the reactor if there is something to follow.
    if len( sys.argv ) > 1:
        reactor.run()

Above, we created a new MultiFollowTail to do just that. Since the filename is included with each callback, we only need a single function to monitor multiple files.
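Subclassing isn't the only way to get the filename into the callback. Another option, sketched below in plain Python 3 with no Twisted involved, is to bind the filename into a per-file callback with functools.partial. The names on_file_line and make_callbacks are mine, for illustration only.

```python
from functools import partial

def on_file_line(filename, line):
    """The single handler: it always knows which file the line came from."""
    return filename + ": " + line

def make_callbacks(filenames):
    """Build one ready-to-assign callback per file, each bound to its name."""
    return {name: partial(on_file_line, name) for name in filenames}
```

Each value in the dictionary takes just the line, so it should be assignable straight to a plain FollowTail's lineReceived. Whether you prefer this over the subclass is mostly a matter of taste.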

And there you have it, a simple easy-to-use mechanism to monitor multiple logfiles with Python.

Tuesday, March 07, 2006

Following a Log File with Twisted

Another POE component that I dearly miss in Twisted is POE::Wheel::FollowTail. FollowTail lets you monitor a growing log file, spewing events when new lines are received. It also gracefully handles log-rotated files by reopening them when they are reset.

After some searching, I found a post in the Kragen-hacks mailing list that does something similar to the POE wheel.

I took that code, added line-buffering, wrapped it in a class, and added events for errors and file resets.

Here's the result:
# Twisted FollowTail
# Mohit Muthanna
#
# A Twisted version of POE::Wheel::FollowTail. Adapted from
# a post by Kragen Sitaker on the Kragen-hacks mailing list.
#
# http://lists.canonical.org/pipermail/kragen-hacks/2005-June/000413.html

from twisted.internet import reactor
from twisted.protocols import basic
import os, stat

class FollowTail:
    from os import linesep as newline
    __line_buffer = ""

    def __init__( self, filename = None, seekend = True, delay = 1 ):
        self.filename = filename
        self.delay = delay
        self.seekend = seekend
        self.keeprunning = False

    def fileIdentity( self, struct_stat ):
        # A (device, inode) pair identifies a file regardless of its name.
        return struct_stat[stat.ST_DEV], struct_stat[stat.ST_INO]

    def start( self ):
        self.keeprunning = True
        self.followTail()

    def stop( self ):
        self.keeprunning = False

    def followTail( self, fileobj = None, fstat = None ):
        if fileobj is None:
            fileobj = open( self.filename )
            if self.seekend: fileobj.seek( 0, 2 )

        line = fileobj.read()

        if line: self.dataReceived( line )

        if fstat is None: fstat = os.fstat( fileobj.fileno() )

        # Named cur_stat so that it does not shadow the stat module.
        try: cur_stat = os.stat( self.filename )
        except OSError: cur_stat = fstat

        # A different identity means the file was rotated. Reopen it.
        if self.fileIdentity( cur_stat ) != self.fileIdentity( fstat ):
            fileobj.close()
            fileobj = open( self.filename )
            fstat = os.fstat( fileobj.fileno() )
            self.fileReset()

        if self.keeprunning:
            reactor.callLater( self.delay, lambda: self.followTail( fileobj, fstat ) )

    def dataReceived( self, data ):
        # Fill buffer
        self.__line_buffer += data

        # Split lines
        lines = self.__line_buffer.splitlines()

        # Keep a trailing partial line in the buffer for next time.
        if not data.endswith( self.newline ):
            self.__line_buffer = lines.pop()
        else:
            self.__line_buffer = ""

        for line in lines:
            self.lineReceived( line )

    def lineReceived( self, line ):
        """Override This"""

    def fileReset( self ):
        """Override This"""

Usage is quite straightforward. The FollowTail constructor has three parameters:

  • filename: Full path to file.

  • seekend: If set to "False", starts from beginning of file. Default "True".

  • delay: How often (in seconds) the file should be polled. Default 1 second.


When a new line is received, FollowTail calls the method "lineReceived" and supplies the line string. If the file has been rotated, it calls "fileReset". Both methods can be overridden in your program.
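The rotation check is the part that tends to raise eyebrows, so here it is in isolation as a plain Python 3 sketch, without Twisted. The idea is the same as fileIdentity above: compare the (device, inode) pair of the file we have open against whatever currently lives at the path. The function names and the little demo are mine.

```python
import os
import tempfile

def file_identity(st):
    """A (device, inode) pair identifies a file regardless of its name."""
    return st.st_dev, st.st_ino

def was_rotated(fileobj, path):
    """True if path now refers to a different file than the open fileobj."""
    try:
        on_disk = os.stat(path)
    except OSError:
        return False   # path is missing mid-rotation; keep the old handle
    return file_identity(on_disk) != file_identity(os.fstat(fileobj.fileno()))

def demo_rotation():
    """Simulate a log rotation in a temp dir; returns three check results."""
    log = os.path.join(tempfile.mkdtemp(), "access-log")
    open(log, "w").close()
    with open(log) as handle:
        untouched = was_rotated(handle, log)
        os.rename(log, log + ".1")        # logrotate moves the old file away
        while_missing = was_rotated(handle, log)
        open(log, "w").close()            # ...and a fresh file appears
        after_rotation = was_rotated(handle, log)
    return untouched, while_missing, after_rotation
```

Only the last of the three checks in demo_rotation should report a rotation, which is the point at which FollowTail would reopen the file and call fileReset.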

Here's some sample code that uses this class.
#!/usr/bin/env python

from twisted.internet import reactor
from followtail import FollowTail

def onLine( line ):
    print( "Line: " + line )

def onReset():
    print( "File Rotated." )

filename = "/home/mohit/var/log/access-log"
tailer = FollowTail( filename )
tailer.lineReceived = onLine
# The method FollowTail calls on rotation is named fileReset.
tailer.fileReset = onReset
tailer.start()

reactor.run()

Download: followtail.py.

Friday, March 03, 2006

Line Filtering With Twisted

Unlike POE, the Twisted framework does not allow for generic stream filters.

This means that in order to pre- or post-process certain data-streams, one cannot reuse processing code used by other classes. I stumbled across this while trying to parse the output of a running process.

Take the following code for example. It listens on a socket, and when a connection is received, it spawns a process and redirects its output to the socket.
# test1.py
from twisted.internet import protocol, reactor
import sys

class TestProcessProtocol( protocol.ProcessProtocol ):
    def outReceived( self, data ):
        # Whatever arrives on the child's STDOUT is passed on as-is.
        self.lineReceived( data )

    def lineReceived( self, line ):
        """Override This"""

class AdminServerProtocol( protocol.Protocol ):
    def connectionMade( self ):
        p = TestProcessProtocol()
        p.lineReceived = self.lineReceived
        p.outConnectionLost = self.transport.loseConnection
        reactor.spawnProcess( p, "./sample_module.py", ["sample_module.py"] )

    def lineReceived( self, line ):
        self.transport.write( "GOT LINE: " + line )

f = protocol.Factory()
f.protocol = AdminServerProtocol

reactor.listenTCP( 1079, f )
reactor.run()

Here's the spawned program.
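#!/usr/bin/env python
# sample_module.py
# (Spawned directly as ./sample_module.py, so it must be executable.)

import sys

for y in range(5):
    print( "line 1" )
    print( "line 2" )
    print( "line 3" )
    print( "line 4" )

    # Flush STDOUT. Added for clarity.
    sys.stdout.flush()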

Let's see what happens when I run test1.py and connect to it.
$ telnet localhost 1079
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GOT LINE: line 1
line 2
line 3
line 4
GOT LINE: line 1
line 2
line 3
line 4
GOT LINE: line 1
line 2
line 3
line 4
GOT LINE: line 1
line 2
line 3
line 4
GOT LINE: line 1
line 2
line 3
line 4
Connection closed by foreign host.

Hmmm... not quite what we expected. It would be nice to receive the data line-by-line instead of the whole buffer when it's flushed. Now, we could modify the spawned process to flush at every line, but sometimes we don't have that luxury. Also, if the process was rapidly spewing out data, or very long lines, we could end-up with partial (incomplete) lines.

Unlike POE, we can't just apply a filter to the stream and let loose. And unfortunately, LineReceiver (in twisted.protocols.basic) can't help us because it only works with sockets.

The solution would be to subclass ProcessProtocol, and use that as our parent class.
from twisted.internet import protocol, reactor
import sys

class LineProcessProtocol( protocol.ProcessProtocol ):
    from os import linesep as newline
    __out_line_buffer = ""

    def outReceived( self, data ):
        # Fill buffer
        self.__out_line_buffer += data

        # Split lines
        lines = self.__out_line_buffer.splitlines()

        # Keep a trailing partial line in the buffer for next time.
        if not data.endswith( self.newline ):
            self.__out_line_buffer = lines.pop()
        else:
            self.__out_line_buffer = ""

        for line in lines:
            self.lineReceived( line )

    def lineReceived( self, line ):
        """Override This"""

    def writeLine( self, data ):
        self.transport.write( data + self.newline )


class TestProcessProtocol( LineProcessProtocol ):
    pass

class AdminServerProtocol( protocol.Protocol ):
    def connectionMade( self ):
        p = TestProcessProtocol()
        p.lineReceived = self.lineReceived
        p.outConnectionLost = self.transport.loseConnection
        reactor.spawnProcess( p, "./sample_module.py", ["sample_module.py"] )

    def lineReceived( self, line ):
        self.transport.write( "GOT LINE: " + line + "\n")

f = protocol.Factory()
f.protocol = AdminServerProtocol

reactor.listenTCP( 1079, f )
reactor.run()

Above, we derive LineProcessProtocol from ProcessProtocol, and create a string-buffer. Every time data is received, we fill the string buffer and split up the lines. We then feed any complete lines to lineReceived, while leaving only incomplete lines in the string buffer.
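The buffering idea is independent of Twisted, so here it is on its own as a small Python 3 sketch. The class name LineBuffer is mine. It differs slightly from the listing above: instead of checking whether the latest chunk ends with os.linesep, it splits on "\n" and always keeps whatever follows the last newline as the pending partial line.

```python
class LineBuffer:
    """Collect arbitrary chunks of text and emit only complete lines."""

    def __init__(self, on_line):
        self._pending = ""        # text received after the last newline
        self._on_line = on_line   # callback invoked once per complete line

    def feed(self, data):
        self._pending += data
        # Everything before the final "\n" is complete; the rest must wait.
        *complete, self._pending = self._pending.split("\n")
        for line in complete:
            # Drop a trailing "\r" so "\r\n" line endings work too.
            self._on_line(line.rstrip("\r"))
```

To try it, create a LineBuffer with a list's append method as the callback and feed it chunks that break in the middle of a line. A line is only delivered once its newline has arrived, no matter how the chunks were cut.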

Let's see what we get.
$ telnet localhost 1079
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GOT LINE: line 1
GOT LINE: line 2
GOT LINE: line 3
GOT LINE: line 4
GOT LINE: line 1
GOT LINE: line 2
GOT LINE: line 3
GOT LINE: line 4
GOT LINE: line 1
GOT LINE: line 2
GOT LINE: line 3
GOT LINE: line 4
GOT LINE: line 1
GOT LINE: line 2
GOT LINE: line 3
GOT LINE: line 4
GOT LINE: line 1
GOT LINE: line 2
GOT LINE: line 3
GOT LINE: line 4
Connection closed by foreign host.

Perfect! We can now use our LineProcessProtocol instead of the stock ProcessProtocol to parse process output.

Now the question is, how easy is it to use Twisted's component architecture to create an interface (ILineReceiver), and adapters for the various classes?

Or better still, how easy would it be to implement a generic stream framework within Twisted, making all this unnecessary, while still enjoying some POE goodness?

Thursday, March 02, 2006

Gentoo Emerge Issues

Last night I was unfortunate enough to break one of my staging servers during an emerge world.

Access Violations

First was the dreaded "ACCESS VIOLATION SUMMARY" error. Portage builds packages inside a virtual sandbox, and uses it to ensure that ebuilds don't do anything bad.

That said, sometimes not-fully-tested ebuilds get into the portage tree and make life a little more complicated.

The right thing to do would be to file a bug-report, with details related to your environment, packages, etc. But then, some of us just want to get things done a little quicker.


$ FEATURES="-sandbox" emerge application


The above command temporarily disables the sandbox during the emerge. Do NOT do this with emerge world, as at that point, you'll just be testing fate.

What you need to do is first proceed with a normal emerge. Go through the logs and find out exactly why the access violations occurred (chmod, unlink, etc.). If you're comfortable with what you see, then disable the sandbox for the broken application only.

Masking Packages

Portage also tried to upgrade postgres from v7 to v8. No no. This won't do.

This means that I would need v8 bindings for ruby, perl and python. Not to mention framework support for things like rails and twisted.

Not going there.

The solution to this is really quite simple. Add the following line to /etc/portage/package.mask.


>=dev-db/postgresql-8.0.0


No DevFS

As close to the metal as Gentoo is, sometimes it does try and get ahead of itself. Like automatic /dev management detection. Within /etc/conf.d/rc, there's a new option called RC_DEVICES. It's set to "auto". This means that it tries to automatically detect if you're using devfs, udev or static device links.

My kernel was compiled with devfs support, but I wasn't using it. Gentoo detected it and assumed that devfs was my preference. Ugh.

Now I had a system that wouldn't boot. You can imagine what it was like trying to figure this out.

Anyhow, to get out of this mess you need to rebuild the kernel without DevFS support.

Bummer.

The Lesson

Do _NOT_ use "emerge world". Instead, use "emerge -pv world", find the packages that need to be upgraded, and emerge them individually.

Always use "-pv" to verify what portage is going to do before the real emerge.