2011 Progress (or lack thereof)
Written by Administrator   
Sunday, 10 July 2011 19:43

2011 started out very well but has become a disaster.  The primary work on the project is performed on the justin.frdcsa.org server which has 4 TB of project data.  The project is divided up in many ways, including a git archive of all of the code, and a computer-specific data section that houses the application data. It was customary to store large files there so that they didn't bog down the git repo.

Unfortunately, while the data drive was in heavy use by a natural language paraphrase generation program, it developed IRQ errors.  This was a serious problem because there is currently no backup infrastructure. That may appear to be a colossal lapse in judgment, but it is attributable to a lack of project funding.  Going forward, the project aims to obtain a new server with a RAID configuration (as well as an offsite backup server) to protect the system against future data loss.  Another goal of our fundraising is to have data recovery performed on the failed drive.

Why should this matter?  It matters because the FRDCSA is engaged in long-term work to improve the state of open source software as it relates to improving the human condition. Whereas many AI projects are military in nature, the goal of this project is to provide tools that benefit individuals (and not military organizations), especially those in poverty and those with disabilities.  The proliferation of cheap computers and smart phones, in conjunction with the capabilities of free software, enables essential services to be rendered, including a medical diagnostic system, a meal planning system, and a personal life planning and organization assistant. The aim is to augment the social safety net and to enhance people's lives with tools for improving organization. Unfortunately, in a for-profit cultural environment, most of these tools have not yet been created as free software.  All of these tools benefit from the capabilities of software that is gathered and stored by the FRDCSA.  An original objective of the project is to package previously unpackaged yet applicable software for Debian GNU/Linux.

On a more positive note, much technical progress has been made this year.  Work has focused on the end-to-end life planning system, with the construction of additional parts of the Free Life Planning system.

http://frdcsa.org/~andrewdo/WebWiki/FreeLifePlanningCoachSoftware.html

We already have a system that helps the user set goals and compute and execute plans to achieve them (SPSE2/IEM).  By contrast, the action planner is concerned with weighing the importance of these goals in various situations, to enable automated replanning in cases where not all of the objectives can be met or where unexpected and unmodelled failures occur.
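
As a rough illustration of the replanning idea (this is not the actual action planner code), the sketch below assumes each goal carries a situation-dependent importance weight and a time cost, and picks the subset of goals with the highest total importance that still fits the time available. The names `Goal` and `select_goals`, the situations, and all of the numbers are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Goal:
    name: str
    cost_hours: float   # hypothetical: time needed to achieve the goal
    importance: dict    # hypothetical: situation name -> importance weight


def select_goals(goals, situation, budget_hours):
    """Return (subset, score) with maximal total importance within the budget.

    Brute force over all subsets: fine for a handful of goals,
    exponential in general.
    """
    best, best_score = (), 0.0
    for r in range(1, len(goals) + 1):
        for subset in combinations(goals, r):
            cost = sum(g.cost_hours for g in subset)
            if cost > budget_hours:
                continue  # this combination cannot be met in the time available
            score = sum(g.importance.get(situation, 0.0) for g in subset)
            if score > best_score:
                best, best_score = subset, score
    return list(best), best_score


goals = [
    Goal("cook dinner", 1.0, {"normal": 3.0, "sick": 1.0}),
    Goal("see doctor", 2.0, {"normal": 1.0, "sick": 5.0}),
    Goal("pay bills", 1.5, {"normal": 4.0, "sick": 2.0}),
]

# Only 3 hours are available, so not every goal can be met.
chosen, score = select_goals(goals, "sick", budget_hours=3.0)
print([g.name for g in chosen], score)
```

Changing the situation from "sick" to "normal" changes which goals are kept, which is the kind of situation-dependent weighing described above. A real planner would also have to handle ordering, dependencies, and failures during execution.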

Much work had been done on improving the Interactive Execution Monitor, which walks users through plans.  Unfortunately most of that seems lost as it was not yet committed to the git repository.

Lately I have taken to working more on the natural language understanding components of the system, in particular the Emacs environment for asserting knowledge from text.  It would be very nice to acquire some of the recent efforts at ontology population from text.  Fortunately, most of the methods from the Capability::TextAnalysis module have been made functional again. However, NLU research is now progressing more along the lines of NL-Soar: we are looking to create a deliberating process that knows the timing and time complexity of various tasks and uses that information to guide the search for answers.  It is clearly a work in progress.
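
To make the idea of a time-aware deliberating process a little more concrete, here is a minimal sketch. It is not NL-Soar and not the FRDCSA code. It assumes each analysis method has a known cost model (estimated seconds as a function of input size) and an expected quality, and it picks the best method whose estimated run time fits a deadline. The method names, cost models, and quality scores are all made up for illustration.

```python
import math

# Hypothetical methods: (name, expected quality, estimated seconds for n tokens)
METHODS = [
    ("keyword lookup", 0.3, lambda n: 1e-6 * n),                          # O(n)
    ("shallow parse", 0.6, lambda n: 1e-5 * n * math.log2(max(n, 2))),    # O(n log n)
    ("deep parse", 0.9, lambda n: 1e-7 * n ** 3),                         # O(n^3)
]


def choose_method(n_tokens, deadline_s):
    """Pick the highest-quality method whose estimated run time fits the deadline.

    Returns the method name, or None if nothing is expected to finish in time.
    """
    feasible = [
        (quality, name)
        for name, quality, cost in METHODS
        if cost(n_tokens) <= deadline_s
    ]
    if not feasible:
        return None
    return max(feasible)[1]


for n in (100, 1000, 10**5, 10**7):
    print(n, choose_method(n, deadline_s=1.0))
```

With these made-up cost models, short inputs get the expensive method and longer inputs fall back to cheaper ones, which is one simple way that knowledge of time complexity can guide the search for answers.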

More information about the failure can be found as quoted in this Facebook post:

The hard drive is definitely hosed, will submit it to recovery operations once I get the funds (perhaps through grant procurement). There was no backup to the secondary drive. The reason the data was not stored in git was it was personal data for applications. Total lost was 600 GB. The main FRDCSA systems are of course still intact but their data sections will have been lost. It should have been rsynced to a remote drive. I am redesigning the data storage systems, would appreciate any knowledge of data storage like NAS or RAID, any advice to get the operation moving in the right direction especially as regards disaster recovery and recovery from hacking. The hard drive was a Seagate 1.5 TB drive that was known to brick frequently, got it on discount without that information. I was writing a natural language paraphrase generation program using the TERp data and was using MLDBM tied hashes with 800MB of data, and the program ran for 4 or 5 hours. Guess that was too much. It was main data storage, contained many virtual machines and did many other data intensive processes. Hoping the platter has not flaked so that there would be a chance of a full or at least partial backup. Drive makes clicking noises and read errors, discontinuing use. The drive itself contained sole backup of my laptop after that filesystem failed. All in all a sad day. However, sometimes I find losses like this refreshing in that you can build a new direction free from the errors of the past. Many design decisions are now undone but I can do it better this time around. Sort of like simulated annealing.
 


Last Updated ( Sunday, 10 July 2011 19:44 )
 