How to I get a python script to run when a file is added to a specific directory?

I have a log file being written by another process which I want to watch for changes. Each time a change occurrs I'd like to read the new data in to do some processing on it.

What's the best way to do this? I was hoping there'd be some sort of hook from the PyWin32 library. I've found the win32file.FindNextChangeNotification function but have no idea how to ask it to watch a specific file.

If anyone's done anything like this I'd be really grateful to hear how...

[Edit] I should have mentioned that I was after a solution that doesn't require polling.

[Edit] Curses! It seems this doesn't work over a mapped network drive. I'm guessing windows doesn't 'hear' any updates to the file the way it does on a local disk.

Replay

Have you already looked at the documentation available on http://timgolden.me.uk/python/win32_how_do_i/watch_directory_for_changes.html? If you only need it to work under Windows the 2nd example seems to be exactly what you want (if you exchange the path of the directory with the one of the file you want to watch).

Otherwise, polling will probably be the only really platform-independent option.

Note: I haven't tried any of these solutions.

Did you try using Watchdog? http://packages.python.org/watchdog/

If polling is good enough for you, I'd just watch if the "modified time" file stat changes. To read it:

os.stat(filename).st_mtime

(Also note that the Windows native change event solution does not work in all circumstances, e.g. on network drives.)

If you want a multiplatform solution, then check QFileSystemWatcher. Here an example code (not sanitized):

from PyQt4 import QtCore

@QtCore.pyqtSlot(str)
def directory_changed(path):
    print('Directory Changed!!!')

@QtCore.pyqtSlot(str)
def file_changed(path):
    print('File Changed!!!')

fs_watcher = QtCore.QFileSystemWatcher(['/path/to/files_1', '/path/to/files_2', '/path/to/files_3'])

fs_watcher.connect(fs_watcher, QtCore.SIGNAL('directoryChanged(QString)'), directory_changed)
fs_watcher.connect(fs_watcher, QtCore.SIGNAL('fileChanged(QString)'), file_changed)

It should not work on windows (maybe with cygwin ?), but for unix user, you should use the "fcntl" system call. Here is an example in Python. It's mostly the same code if you need to write it in C (same function names)

import time
import fcntl
import os
import signal

FNAME = "/HOME/TOTO/FILETOWATCH"

def handler(signum, frame):
    print "File %s modified" % (FNAME,)

signal.signal(signal.SIGIO, handler)
fd = os.open(FNAME,  os.O_RDONLY)
fcntl.fcntl(fd, fcntl.F_SETSIG, 0)
fcntl.fcntl(fd, fcntl.F_NOTIFY,
            fcntl.DN_MODIFY | fcntl.DN_CREATE | fcntl.DN_MULTISHOT)

while True:
    time.sleep(10000)

Check out pyinotify.

inotify replaces dnotify (from an earlier answer) in newer linuxes and allows file-level rather than directory-level monitoring.

Well after a bit of hacking of Tim Golden's script, I have the following which seems to work quite well:

import os

import win32file
import win32con

path_to_watch = "." # look at the current directory
file_to_watch = "test.txt" # look for changes to a file called test.txt

def ProcessNewData( newData ):
    print "Text added: %s"%newData

# Set up the bits we'll need for output
ACTIONS = {
  1 : "Created",
  2 : "Deleted",
  3 : "Updated",
  4 : "Renamed from something",
  5 : "Renamed to something"
}
FILE_LIST_DIRECTORY = 0x0001
hDir = win32file.CreateFile (
  path_to_watch,
  FILE_LIST_DIRECTORY,
  win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
  None,
  win32con.OPEN_EXISTING,
  win32con.FILE_FLAG_BACKUP_SEMANTICS,
  None
)

# Open the file we're interested in
a = open(file_to_watch, "r")

# Throw away any exising log data
a.read()

# Wait for new data and call ProcessNewData for each new chunk that's written
while 1:
  # Wait for a change to occur
  results = win32file.ReadDirectoryChangesW (
    hDir,
    1024,
    False,
    win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
    None,
    None
  )

  # For each change, check to see if it's updating the file we're interested in
  for action, file in results:
    full_filename = os.path.join (path_to_watch, file)
    #print file, ACTIONS.get (action, "Unknown")
    if file == file_to_watch:
        newText = a.read()
        if newText != "":
            ProcessNewData( newText )

It could probably do with a load more error checking, but for simply watching a log file and doing some processing on it before spitting it out to the screen, this works well.

Thanks everyone for your input - great stuff!

Check my answer to a similar question. You could try the same loop in Python. This page suggests:

import time

while 1:
    where = file.tell()
    line = file.readline()
    if not line:
        time.sleep(1)
        file.seek(where)
    else:
        print line, # already has newline

Also see the question tail() a file with Python.

Well, since you are using Python, you can just open a file and keep reading lines from it.

f = open('file.log')

If the line read is not empty, you process it.

line = f.readline()
if line:
    // Do what you want with the line

You may be missing that it is ok to keep calling readline at the EOF. It will just keep returning an empty string in this case. And when something is appended to the log file, the reading will continue from where it stopped, as you need.

If you are looking for a solution that uses events, or a particular library, please specify this in your question. Otherwise, I think this solution is just fine.

Simplest solution for me is using watchdog's tool watchmedo

From https://pypi.python.org/pypi/watchdog I now have a process that looks up the sql files in a directory and executes them if necessary.

watchmedo shell-command \
--patterns="*.sql" \
--recursive \
--command='~/Desktop/load_files_into_mysql_database.sh' \
.

Here is a simplified version of Kender's code that appears to do the same trick and does not import the entire file:

# Check file for new data.

import time

f = open(r'c:\temp\test.txt', 'r')

while True:

    line = f.readline()
    if not line:
        time.sleep(1)
        print 'Nothing New'
    else:
        print 'Call Function: ', line

As you can see in Tim Golden's article, pointed by Horst Gutmann, WIN32 is relatively complex and watches directories, not a single file.

I'd like to suggest you look into IronPython, which is a .NET python implementation. With IronPython you can use all the .NET functionality - including

System.IO.FileSystemWatcher

Which handles single files with a simple Event interface.

This is another modification of Tim Goldan's script that runs on linux and adds a simple watcher for file modification by using a dict (file=>time).

usage: whateverName.py path_to_dir_to_watch

#!/usr/bin/env python

import os, sys, time

def files_to_timestamp(path):
    files = [os.path.join(path, f) for f in os.listdir(path)]
    return dict ([(f, os.path.getmtime(f)) for f in files])

if __name__ == "__main__":

    path_to_watch = sys.argv[1]
    print "Watching ", path_to_watch

    before = files_to_timestamp(path_to_watch)

    while 1:
        time.sleep (2)
        after = files_to_timestamp(path_to_watch)

        added = [f for f in after.keys() if not f in before.keys()]
        removed = [f for f in before.keys() if not f in after.keys()]
        modified = []

        for f in before.keys():
            if not f in removed:
                if os.path.getmtime(f) != before.get(f):
                    modified.append(f)

        if added: print "Added: ", ", ".join(added)
        if removed: print "Removed: ", ", ".join(removed)
        if modified: print "Modified ", ", ".join(modified)

        before = after

ACTIONS = {
  1 : "Created",
  2 : "Deleted",
  3 : "Updated",
  4 : "Renamed from something",
  5 : "Renamed to something"
}
FILE_LIST_DIRECTORY = 0x0001

class myThread (threading.Thread):
    def __init__(self, threadID, fileName, directory, origin):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.fileName = fileName
        self.daemon = True
        self.dir = directory
        self.originalFile = origin
    def run(self):
        startMonitor(self.fileName, self.dir, self.originalFile)

def startMonitor(fileMonitoring,dirPath,originalFile):
    hDir = win32file.CreateFile (
        dirPath,
        FILE_LIST_DIRECTORY,
        win32con.FILE_SHARE_READ | win32con.FILE_SHARE_WRITE,
        None,
        win32con.OPEN_EXISTING,
        win32con.FILE_FLAG_BACKUP_SEMANTICS,
        None
    )
    # Wait for new data and call ProcessNewData for each new chunk that's
    # written
    while 1:
        # Wait for a change to occur
        results = win32file.ReadDirectoryChangesW (
            hDir,
            1024,
            False,
            win32con.FILE_NOTIFY_CHANGE_LAST_WRITE,
            None,
            None
        )
        # For each change, check to see if it's updating the file we're
        # interested in
        for action, file_M in results:
            full_filename = os.path.join (dirPath, file_M)
            #print file, ACTIONS.get (action, "Unknown")
            if len(full_filename) == len(fileMonitoring) and action == 3:
                #copy to main file
                ...

You might want to take a look at this:

By default it "watches" a whole directory rather than a single file but the modification should be pretty straightforward.

This is an example of checking a file for changes. One that may not be the best way of doing it, but it sure is a short way.

Handy tool for restarting application when changes have been made to the source. I made this when playing with pygame so I can see effects take place immediately after file save.

When used in pygame make sure the stuff in the 'while' loop is placed in your game loop aka update or whatever. Otherwise your application will get stuck in an infinite loop and you will not see your game updating.

file_size_stored = os.stat('neuron.py').st_size

  while True:
    try:
      file_size_current = os.stat('neuron.py').st_size
      if file_size_stored != file_size_current:
        restart_program()
    except:
      pass

In case you wanted the restart code which I found on the web. Here it is. (Not relevant to the question, though it could come in handy)

def restart_program(): #restart application
    python = sys.executable
    os.execl(python, python, * sys.argv)

Have fun making electrons do what you want them to do.

If you have Windows Server 2003 Resource Kit Tools installed, you can start a TAIL command in a subprocess.

I don't know any Windows specific function. You could try getting the MD5 hash of the file every second/minute/hour (depends on how fast you need it) and compare it to the last hash. When it differs you know the file has been changed and you read out the newest lines.

I'd try something like this.

    try:
            f = open(filePath)
    except IOError:
            print "No such file: %s" % filePath
            raw_input("Press Enter to close window")
    try:
            lines = f.readlines()
            while True:
                    line = f.readline()
                    try:
                            if not line:
                                    time.sleep(1)
                            else:
                                    functionThatAnalisesTheLine(line)
                    except Exception, e:
                            # handle the exception somehow (for example, log the trace) and raise the same exception again
                            raw_input("Press Enter to close window")
                            raise e
    finally:
            f.close()

The loop checks if there is a new line(s) since last time file was read - if there is, it's read and passed to the functionThatAnalisesTheLine function. If not, script waits 1 second and retries the process.

Category: python Time: 2008-10-08 Views: 3

Related post

iOS development

Android development

Python development

JAVA development

Development language

PHP development

Ruby development

search

Front-end development

Database

development tools

Open Platform

Javascript development

.NET development

cloud computing

server

Copyright (C) avrocks.com, All Rights Reserved.

processed in 0.134 (s). 12 q(s)