Commit 9ce9f55f authored by Ivan Tyagov

Initial import of 'MimetypesRegistry' and 'PortalTransforms' products.


git-svn-id: https://svn.erp5.org/repos/public/erp5/trunk@16694 20353a03-c40f-0410-a6d1-a30d3c3de9de
parent 896efdee
Don't use ChangeLog; use HISTORY.txt instead.
2004-07-24 Christian Heimes <heimes@faho.rwth-aachen.de>
* Changed version to stick to Archetypes version.
import os
from Products.CMFCore.DirectoryView import addDirectoryViews, registerDirectory, \
createDirectoryView, manage_listAvailableDirectories
from Products.CMFCore.utils import getToolByName, minimalpath
from Globals import package_home
from OFS.ObjectManager import BadRequestException
from Products.MimetypesRegistry import GLOBALS, skins_dir
from Products.MimetypesRegistry.interfaces import IMimetypesRegistry
from Acquisition import aq_base
from StringIO import StringIO
def install(self):
out = StringIO()
id = 'mimetypes_registry'
if hasattr(aq_base(self), id):
mtr = getattr(self, id)
if not IMimetypesRegistry.isImplementedBy(mtr) or \
not getattr(aq_base(mtr), '_new_style_mtr', None) == 1:
print >>out, 'Removing old mimetypes registry tool'
self.manage_delObjects([id,])
if not hasattr(self, id):
addTool = self.manage_addProduct['MimetypesRegistry'].manage_addTool
addTool('MimeTypes Registry')
print >>out, 'Installing mimetypes registry tool'
skinstool=getToolByName(self, 'portal_skins')
fullProductSkinsPath = os.path.join(package_home(GLOBALS), skins_dir)
productSkinsPath = minimalpath(fullProductSkinsPath)
registered_directories = manage_listAvailableDirectories()
if productSkinsPath not in registered_directories:
registerDirectory(skins_dir, GLOBALS)
try:
addDirectoryViews(skinstool, skins_dir, GLOBALS)
except BadRequestException, e:
pass # directory view has already been added
files = os.listdir(fullProductSkinsPath)
for productSkinName in files:
if os.path.isdir(os.path.join(fullProductSkinsPath, productSkinName)) \
and productSkinName != 'CVS':
for skinName in skinstool.getSkinSelections():
path = skinstool.getSkinPath(skinName)
path = [i.strip() for i in path.split(',')]
try:
if productSkinName not in path:
path.insert(path.index('custom') +1, productSkinName)
except ValueError:
if productSkinName not in path:
path.append(productSkinName)
path = ','.join(path)
skinstool.addSkinSelection(skinName, path)
return out.getvalue()
def fixUpSMIGlobs(self):
from Products.MimetypesRegistry.mime_types import smi_mimetypes
from Products.Archetypes.debug import log
mtr = getToolByName(self, 'mimetypes_registry')
smi_mimetypes.initialize(mtr)
# Now comes the fun part. For every glob, look up an extension
# matching the glob and unregister it.
for glob in mtr.globs.keys():
if mtr.extensions.has_key(glob):
log('Found glob %s in extensions registry, removing.' % glob)
mti = mtr.extensions[glob]
del mtr.extensions[glob]
if glob in mti.extensions:
log('Found glob %s in mimetype %s extensions, '
'removing.' % (glob, mti))
exts = list(mti.extensions)
exts.remove(glob)
mti.extensions = tuple(exts)
mtr.register(mti)
1.4.0-final - 2006-06-16
========================
* Use zope.contenttype in favor of zope.app.content_types if available.
[hannosch]
1.4.0-beta2 - 2006-05-12
========================
* Use zope.app.content_types in favor of OFS.content_types if available.
[stefan]
* Spring-cleaning of tests infrastructure.
[hannosch]
1.4.0-beta1 - 2006-03-26
========================
* fixed Plone #5027: MimeTypeRegistry.classify doesn't handle
"no mimetype" gracefully. Returns 'None' now.
[jensens]
* fixed http://dev.plone.org/archetypes/ticket/622
[jensens]
1.4.0-alpha02 - 2006-02-23
==========================
* ensured that the key returned by windows_mimetypes.py exists;
mark says the best way is to examine each key to ensure it's valid, but
that would be slower.
[runyaga]
* removed odd archetypes 1.3 style version checking
[jensens]
* Removed BBB code for CMFCorePermissions import location.
[hannosch]
* removed deprecation warning for ToolInit.
[jensens]
* dropped backward compatibility with the time when MTR was part of
PortalTransforms.
[jensens]
1.3.8-final02 - 2006-01-15
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version again.
[yenzenz]
1.3.8-RC1 - 2005-12-29
======================
* Split yet another part of register() into a separate
method. Clean up smi_mimetypes initialize a little bit to use
the new method when adding new mimetypes to an already-registered
entry.
[dreamcatcher]
* Include aliases in the list of mimetypes for an entry. Based on a
patch by Jean Jordaan.
[dreamcatcher]
* Use a SAX-based parser instead of minidom to improve Zope startup
time (by 17 seconds on my Pismo) and memory footprint.
[dreamcatcher]
* Augment known mimetypes with Windows mimetypes, if available.
[dreamcatcher]
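The switch from minidom to a SAX-based parser mentioned above can be illustrated with a minimal sketch. It assumes Python 3 and the standard-library `xml.sax` module; the handler name `MimeInfoHandler`, the helper `parse_mime_info` and the sample XML are hypothetical and only mimic the general shape of a shared-mime-info file, not the product's actual parser.

```python
import xml.sax

# Made-up sample in the general shape of freedesktop.org.xml (assumption).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
  <mime-type type="text/x-python">
    <comment>Python script</comment>
    <glob pattern="*.py"/>
  </mime-type>
  <mime-type type="application/x-tar">
    <glob pattern="*.tar"/>
    <alias type="application/x-gtar"/>
  </mime-type>
</mime-info>
"""


class MimeInfoHandler(xml.sax.ContentHandler):
    """Collect globs and aliases per mime type while streaming the file."""

    def __init__(self):
        super().__init__()
        self.entries = {}      # mime type name -> {'globs': [...], 'aliases': [...]}
        self._current = None   # entry for the <mime-type> element being read

    def startElement(self, name, attrs):
        if name == 'mime-type':
            self._current = {'globs': [], 'aliases': []}
            self.entries[attrs['type']] = self._current
        elif self._current is not None:
            if name == 'glob':
                self._current['globs'].append(attrs['pattern'])
            elif name == 'alias':
                self._current['aliases'].append(attrs['type'])

    def endElement(self, name):
        if name == 'mime-type':
            self._current = None


def parse_mime_info(data):
    """Parse shared-mime-info style XML (bytes) without building a DOM."""
    handler = MimeInfoHandler()
    xml.sax.parseString(data, handler)
    return handler.entries
```

Because the handler only keeps the few attributes it needs, no full document tree is held in memory, which is the likely source of the startup and footprint gains described in the entry.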
1.3.7-final01 - 2005-10-11
==========================
* For the sake of sanity, include a 'mime.types' with
MimetypesRegistry to minimize the platform-specific differences in
mime detection when the python 'mimetypes' module is involved.
[dreamcatcher]
* globs from freedesktop.org shared-mime-info were incorrectly
mapped to 'extensions' and never really worked because the code
tried to strip a leading dot, where the globs normally had '*.'.
The side-effect of this is that in *nix, the Python 'mimetypes'
module would happily read '/etc/mime.types' and gracefully work
(/etc/mime.types has most of the extensions of shared-mime-info
but a few), where on Windows it would fail to detect mimetypes by
extension.
[dreamcatcher]
* Added support for real globs, using fnmatch.translate and
re.compile and a migration function that will be run from Plone
2.1.1 migration, with some tests specific for globs read from
shared-mime-info.
[dreamcatcher]
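The glob support described in this release can be sketched as follows, assuming Python 3. The class `GlobTable` is a hypothetical stand-in for the registry's glob mapping; only `fnmatch.translate` and `re.compile` come from the changelog entry itself.

```python
import fnmatch
import re


class GlobTable:
    """Map shell-style globs to mime type names (illustrative only)."""

    def __init__(self):
        self._globs = {}  # glob string -> (compiled regex, mime type name)

    def register(self, glob, mimetype):
        # fnmatch.translate turns '*.py' into a regex for the whole name.
        pattern = re.compile(fnmatch.translate(glob))
        self._globs[glob] = (pattern, mimetype)

    def match(self, filename):
        """Return the mime type of the first glob matching filename, or None."""
        for glob in sorted(self._globs):
            pattern, mimetype = self._globs[glob]
            if pattern.match(filename):
                return mimetype
        return None
```

Matching against the whole filename is what lets patterns such as `Makefile` work, which a lookup keyed on the extension alone cannot express.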
1.3.6-final01 - 2005-08-30
==========================
* After sleeping on it for a night, I removed the method added yesterday.
Instead, following some heuristics and the OOo documentation, I added
some magic bytes to magic.py and wrote better tests.
[yenzenz]
* Added a method to detect the mimetypes of zipped files, especially
for OOo. All OpenOffice files and zip files are now detected properly.
My simple tests pass: an OOo Writer file and a simple zipfile are
detected.
[yenzenz]
* updated freedesktop.org.xml file to latest CVS version rev 1.57 from
http://cvs.freedesktop.org/mime/shared-mime-info/freedesktop.org.xml
[yenzenz]
1.3.5-final03 - 2005-08-07
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version again.
[yenzenz]
1.3.5-final02 - 2005-08-01
==========================
* nothing again, need to stick to Archetypes version
[yenzenz]
1.3.5-final - 2005-07-17
========================
* Added Five/Zope3 interface bridges and implements
[tiran]
1.3.4-final - 2005-07-06
========================
* added icons for openoffice.org files
[yenzenz]
1.3.3-final06 - 2005-05-20
==========================
* nothing (I hate to write this. But the odd version checking needs it).
[yenzenz]
1.3.3-final-02 - 2005-03-25
===========================
* nothing
1.3.3-final - 2005-03-05
========================
* More a workaround than a fix for [ 1056252 ] Content type algorithm
can be confused.
[tiran]
* workaround for [ 1068001 ] BaseUnit Encoding Error: macintosh
[yenzenz]
* In the case all else fails, try to resort to guess_content_type so
that at least we don't get 'text/plain' when the file is in fact a
binary file.
[dreamcatcher]
1.3.2-5 - 2004-09-30
====================
* nothing
1.3.2-4 - 2004-09-30
====================
* nothing
1.3.2-3 - 2004-09-25
====================
* nothing
1.3.2-2 - 2004-09-17
====================
* nothing
1.3.2-1 - 2004-09-04
====================
* Cleaned up major parts of PT by removing the python only implementation which
was broken anyway
[tiran]
1.3.1-1 - 2004-08-16
====================
* Added text/x-html-safe mime type for new transformation
[tiran]
* Don't return acquisition wrapped mimetype items because they may lead to
memory leaks.
[tiran]
1.3.0-3 - 2004-08-06
====================
* Added text/wiki mime type
[tiran]
* Don't log redefine warning if the current and the new object are equal
[tiran]
* initialize() MTR on __setstate__ aka when the MTR is loaded from ZODB.
[tiran]
1.3.0-2 - 2004-07-29
====================
* Nothing changed
Copyright (c) 2002-2003, Benjamin Saller <bcsaller@ideasuite.com>, and
the respective authors.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Archetypes nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
import os
from Acquisition import Explicit
from OFS.SimpleItem import Item
from AccessControl import ClassSecurityInfo
from Globals import Persistent, InitializeClass
from Products.CMFCore.permissions import ManagePortal
from Products.MimetypesRegistry.interfaces import IMimetype
from Products.MimetypesRegistry.common import MimeTypeException
class MimeTypeItem(Persistent, Explicit, Item):
security = ClassSecurityInfo()
__implements__ = (IMimetype,)
extensions = ()
globs = ()
def __init__(self, name='', mimetypes=None, extensions=None,
binary=None, icon_path='', globs=None):
if name:
self.__name__ = self.id = name
if mimetypes is not None:
self.mimetypes = mimetypes
if extensions is not None:
self.extensions = extensions
if binary is not None:
self.binary = binary
if globs is not None:
self.globs = globs
self.icon_path = icon_path or guess_icon_path(self)
def __str__(self):
return self.normalized()
def __repr__(self):
return "<mimetype %s>" % self.mimetypes[0]
def __cmp__(self, other):
try:
if isinstance(other, MimeTypeItem):
other = other.normalized()
except:
pass
return not (other in self.mimetypes)
def __hash__(self):
return hash(self.name())
security.declarePublic('name')
def name(self):
""" The name of this object """
return self.__name__
security.declarePublic('major')
def major(self):
""" return the major part of the RFC-2046 name for this mime type """
return self.normalized().split('/', 1)[0]
security.declarePublic('minor')
def minor(self):
""" return the minor part of the RFC-2046 name for this mime type """
return self.normalized().split('/', 1)[1]
security.declarePublic('normalized')
def normalized(self):
""" return the main RFC-2046 name for this mime type
e.g. if this object has names ('text/restructured', 'text/x-rst')
then self.normalized() will always return the first form.
"""
return self.mimetypes[0]
security.declareProtected(ManagePortal, 'edit')
def edit(self, name, mimetypes, extensions, icon_path,
binary=0, globs=None, REQUEST=None):
"""edit this mime type"""
# if mimetypes and extensions are string instead of lists,
# split them on new lines
if isinstance(mimetypes, basestring):
mimetypes = [mts.strip() for mts in mimetypes.split('\n')
if mts.strip()]
if isinstance(extensions, basestring):
extensions = [mts.strip() for mts in extensions.split('\n')
if mts.strip()]
if isinstance(globs, basestring):
globs = [glob.strip() for glob in globs.split('\n')
if glob.strip()]
self.__name__ = self.id = name
self.mimetypes = mimetypes
self.globs = globs
self.extensions = extensions
self.binary = binary
self.icon_path = icon_path
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
InitializeClass(MimeTypeItem)
ICONS_DIR = os.path.join(os.path.dirname(__file__), 'skins', 'mimetypes_icons')
def guess_icon_path(mimetype, icons_dir=ICONS_DIR, icon_ext='png'):
if mimetype.extensions:
for ext in mimetype.extensions:
icon_path = '%s.%s' % (ext, icon_ext)
if os.path.exists(os.path.join(icons_dir, icon_path)):
return icon_path
icon_path = '%s.png' % mimetype.major()
if os.path.exists(os.path.join(icons_dir, icon_path)):
return icon_path
return 'unknown.png'
import os
import re
import fnmatch
from types import UnicodeType
from OFS.Folder import Folder
from Globals import InitializeClass
from Acquisition import aq_parent
from Acquisition import aq_base
from Globals import PersistentMapping
from AccessControl import ClassSecurityInfo
from BTrees.OOBTree import OOBTree
from Products.CMFCore.permissions import ManagePortal
from Products.CMFCore.ActionProviderBase import ActionProviderBase
from Products.CMFCore.TypesTool import FactoryTypeInformation
from Products.CMFCore.utils import UniqueObject
from Products.PageTemplates.PageTemplateFile import PageTemplateFile
from Products.MimetypesRegistry.interfaces import ISourceAdapter
from Products.MimetypesRegistry.interfaces import IMimetypesRegistry
from Products.MimetypesRegistry.interfaces import IMimetype
from Products.MimetypesRegistry.interfaces import IClassifier
from Products.MimetypesRegistry.MimeTypeItem import MimeTypeItem
from Products.MimetypesRegistry.mime_types import initialize
from Products.MimetypesRegistry.mime_types import magic
from Products.MimetypesRegistry.common import log
from Products.MimetypesRegistry.common import MimeTypeException
from Products.MimetypesRegistry.common import STRING_TYPES
from Products.MimetypesRegistry.common import _www
from Products.MimetypesRegistry.encoding import guess_encoding
try:
from zope.contenttype import guess_content_type
except ImportError: # BBB: Zope < 2.10
try:
from zope.app.content_types import guess_content_type
except ImportError: # BBB: Zope < 2.9
from OFS.content_types import guess_content_type
suffix_map = {
'tgz': '.tar.gz',
'taz': '.tar.gz',
'tz': '.tar.gz',
}
encodings_map = {
'gz': 'gzip',
'Z': 'compress',
}
class MimeTypesRegistry(UniqueObject, ActionProviderBase, Folder):
"""Mimetype registry that deals with
a) registering types
b) wildcarding of rfc-2046 types
c) classifying data into a given type
"""
__implements__ = (IMimetypesRegistry, ISourceAdapter)
id = 'mimetypes_registry'
meta_type = 'MimeTypes Registry'
isPrincipiaFolderish = 1 # Show up in the ZMI
meta_types = all_meta_types = (
{ 'name' : 'MimeType',
'action' : 'manage_addMimeTypeForm'},
)
manage_options = (
( { 'label' : 'MimeTypes',
'action' : 'manage_main'},) +
Folder.manage_options[2:]
)
manage_addMimeTypeForm = PageTemplateFile('addMimeType', _www)
manage_main = PageTemplateFile('listMimeTypes', _www)
manage_editMimeTypeForm = PageTemplateFile('editMimeType', _www)
security = ClassSecurityInfo()
# FIXME
__allow_access_to_unprotected_subobjects__ = 1
def __init__(self,):
self.encodings_map = encodings_map.copy()
self.suffix_map = suffix_map.copy()
# Major key -> minor IMimetype objects
self._mimetypes = PersistentMapping()
# ext -> IMimetype mapping
self.extensions = PersistentMapping()
# glob -> (regex, mimetype) mapping
self.globs = OOBTree()
self.manage_addProperty('defaultMimetype', 'text/plain', 'string')
self.manage_addProperty('unicodePolicies', 'strict ignore replace',
'tokens')
self.manage_addProperty('unicodePolicy', 'unicodePolicies', 'selection')
self.manage_addProperty('fallbackEncoding', 'latin1', 'string')
# initialize mime types
initialize(self)
self._new_style_mtr = 1
security.declareProtected(ManagePortal, 'register')
def register(self, mimetype):
""" Register a new mimetype
mimetype must implement IMimetype
"""
mimetype = aq_base(mimetype)
assert IMimetype.isImplementedBy(mimetype)
for t in mimetype.mimetypes:
self.register_mimetype(t, mimetype)
for extension in mimetype.extensions:
self.register_extension(extension, mimetype)
for glob in mimetype.globs:
self.register_glob(glob, mimetype)
security.declareProtected(ManagePortal, 'register_mimetype')
def register_mimetype(self, mt, mimetype):
major, minor = split(mt)
if not major or not minor or minor == '*':
raise MimeTypeException('Can\'t register mime type %s' % mt)
group = self._mimetypes.setdefault(major, PersistentMapping())
if group.has_key(minor):
if group.get(minor) != mimetype:
log('Warning: redefining mime type %s (%s)' % (
mt, mimetype.__class__))
group[minor] = mimetype
security.declareProtected(ManagePortal, 'register_extension')
def register_extension(self, extension, mimetype):
""" Associate a file's extension to a IMimetype
extension is a string representing a file extension (not
prefixed by a dot) mimetype must implement IMimetype
"""
mimetype = aq_base(mimetype)
if self.extensions.has_key(extension):
if self.extensions.get(extension) != mimetype:
log('Warning: redefining extension %s from %s to %s' % (
extension, self.extensions[extension], mimetype))
# we don't validate the format yet, but it's e.g. "txt" or "html"
self.extensions[extension] = mimetype
security.declareProtected(ManagePortal, 'register_glob')
def register_glob(self, glob, mimetype):
""" Associate a glob to a IMimetype
glob is a shell-like glob that will be translated to a regex
to match against whole filename.
mimetype must implement IMimetype.
"""
globs = getattr(self, 'globs', None)
if globs is None:
self.globs = globs = OOBTree()
mimetype = aq_base(mimetype)
existing = globs.get(glob)
if existing is not None:
regex, mt = existing
if mt != mimetype:
log('Warning: redefining glob %s from %s to %s' % (
glob, mt, mimetype))
# we don't validate the glob yet, but it's e.g. "*.txt" or "*.html"
pattern = re.compile(fnmatch.translate(glob))
globs[glob] = (pattern, mimetype)
security.declareProtected(ManagePortal, 'unregister')
def unregister(self, mimetype):
""" Unregister a mimetype
mimetype must implement IMimetype
"""
assert IMimetype.isImplementedBy(mimetype)
for t in mimetype.mimetypes:
major, minor = split(t)
group = self._mimetypes.get(major, {})
if group.get(minor) == mimetype:
del group[minor]
for e in mimetype.extensions:
if self.extensions.get(e) == mimetype:
del self.extensions[e]
globs = getattr(self, 'globs', None)
if globs is not None:
for glob in mimetype.globs:
existing = globs.get(glob)
if existing is None:
continue
regex, mt = existing
if mt == mimetype:
del globs[glob]
security.declarePublic('mimetypes')
def mimetypes(self):
"""Return all defined mime types, each one implements at least
IMimetype
"""
res = {}
for g in self._mimetypes.values():
for mt in g.values():
res[mt] =1
return [aq_base(mtitem) for mtitem in res.keys()]
security.declarePublic('list_mimetypes')
def list_mimetypes(self):
"""Return all defined mime types, as string"""
return [str(mt) for mt in self.mimetypes()]
security.declarePublic('lookup')
def lookup(self, mimetypestring):
"""Look up the IMimetype objects matching mimetypestring.
mimetypestring may have an empty minor part or contain a
wildcard (*). mimetypestring may also be an IMimetype object,
in which case it is returned unchanged (wrapped in a tuple).
Return a tuple of the mimetype objects associated with the
RFC-2046 name, or an empty tuple if none is known.
"""
if IMimetype.isImplementedBy(mimetypestring):
return (aq_base(mimetypestring), )
__traceback_info__ = (repr(mimetypestring), str(mimetypestring))
major, minor = split(str(mimetypestring))
group = self._mimetypes.get(major, {})
if not minor or minor == '*':
res = group.values()
else:
res = group.get(minor)
if res:
res = (res,)
else:
return ()
return tuple([aq_base(mtitem) for mtitem in res])
security.declarePublic('lookupExtension')
def lookupExtension(self, filename):
"""Look up the IMimetype object matching filename.
filename may be a file name like 'content.txt' or an extension
like 'rest'.
Return the IMimetype object associated with the file's
extension, or None.
"""
if filename.find('.') != -1:
base, ext = os.path.splitext(filename)
ext = ext[1:] # remove the dot
while self.suffix_map.has_key(ext):
base, ext = os.path.splitext(base + self.suffix_map[ext])
ext = ext[1:] # remove the dot
else:
ext = filename
base = None
# XXX This code below make no sense and may break because base
# isn't defined.
if self.encodings_map.has_key(ext) and base:
encoding = self.encodings_map[ext]
base, ext = os.path.splitext(base)
ext = ext[1:] # remove the dot
else:
encoding = None
return aq_base(self.extensions.get(ext))
security.declarePublic('globFilename')
def globFilename(self, filename):
"""Look up the IMimetype object whose glob matches filename.
filename must be a complete filename with extension.
Return the IMimetype object associated with the first matching
glob, or None.
"""
globs = getattr(self, 'globs', None)
if globs is None:
return None
for key in globs.keys():
glob, mimetype = globs[key]
if glob.match(filename):
return aq_base(mimetype)
return None
security.declarePublic('lookupGlob')
def lookupGlob(self, glob):
globs = getattr(self, 'globs', None)
if globs is None:
return None
return aq_base(globs.get(glob))
def _classifiers(self):
return [mt for mt in self.mimetypes() if IClassifier.isImplementedBy(mt)]
security.declarePublic('classify')
def classify(self, data, mimetype=None, filename=None):
"""Classify works as follows:
1) you tell me the rfc-2046 name and I give you an IMimetype
object
2) the filename includes an extension from which we can guess
the mimetype
3) we can optionally introspect the data
4) default to self.defaultMimetype if no data was provided,
else to application/octet-stream if a filename was provided,
else to the guess_content_type result (text/plain at worst)
Return an IMimetype object or None
"""
mt = None
if mimetype:
mt = self.lookup(mimetype)
if mt:
mt = mt[0]
elif filename:
mt = self.lookupExtension(filename)
if mt is None:
mt = self.globFilename(filename)
if data and not mt:
for c in self._classifiers():
if c.classify(data):
mt = c
break
if not mt:
mstr = magic.guessMime(data)
if mstr:
mt = self.lookup(mstr)[0]
if not mt:
if not data:
mtlist = self.lookup(self.defaultMimetype)
elif filename:
mtlist = self.lookup('application/octet-stream')
else:
failed = 'text/x-unknown-content-type'
filename = filename or ''
data = data or ''
ct, enc = guess_content_type(filename, data, None)
if ct == failed:
ct = 'text/plain'
mtlist = self.lookup(ct)
if len(mtlist)>0:
mt = mtlist[0]
else:
return None
# Remove acquisition wrappers
return aq_base(mt)
def __call__(self, data, **kwargs):
""" Return a triple (data, filename, mimetypeobject) given
some raw data and optional parameters
method from the ISourceAdapter interface
"""
mimetype = kwargs.get('mimetype', None)
filename = kwargs.get('filename', None)
encoding = kwargs.get('encoding', None)
mt = None
if hasattr(data, 'filename'):
filename = os.path.basename(data.filename)
elif hasattr(data, 'name'):
filename = os.path.basename(data.name)
if hasattr(data, 'read'):
_data = data.read()
if hasattr(data, 'seek'):
data.seek(0)
data = _data
# We need to figure out if data is binary and skip encoding if
# it is
mt = self.classify(data, mimetype=mimetype, filename=filename)
if not mt.binary and not type(data) is UnicodeType:
# if no encoding specified, try to guess it from data
if encoding is None:
encoding = self.guess_encoding(data)
# ugly workaround for
# https://sourceforge.net/tracker/?func=detail&aid=1068001&group_id=75272&atid=543430
# covered by
# https://sourceforge.net/tracker/?func=detail&atid=355470&aid=843590&group_id=5470
# dont remove this code unless python is fixed.
if encoding == "macintosh":
encoding = 'mac_roman'
try:
try:
data = unicode(data, encoding, self.unicodePolicy)
except (ValueError, LookupError):
# wrong unicodePolicy
data = unicode(data, encoding)
except:
data = unicode(data, self.fallbackEncoding)
return (data, filename, aq_base(mt))
security.declarePublic('guess_encoding')
def guess_encoding(self, data):
""" Try to guess the encoding of a text value. If no encoding
can be guessed, use the default charset from the site
properties (Zope), falling back to UTF-8 (this should never
happen with correct site_properties, but an AttributeError is
always raised without Zope).
"""
if type(data) is type(u''):
# data maybe unicode but with another encoding specified
data = data.encode('UTF-8')
encoding = guess_encoding(data)
if encoding is None:
try:
site_props = self.portal_properties.site_properties
encoding = site_props.getProperty('default_charset', 'UTF-8')
except:
encoding = 'UTF-8'
return encoding
security.declareProtected(ManagePortal, 'manage_delObjects')
def manage_delObjects(self, ids, REQUEST=None):
""" delete the selected mime types """
for id in ids:
self.unregister(self.lookup(id)[0])
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'manage_addMimeType')
def manage_addMimeType(self, id, mimetypes, extensions, icon_path,
binary=0, globs=None, REQUEST=None):
"""add a mime type to the tool"""
mt = MimeTypeItem(id, mimetypes, extensions=extensions,
binary=binary, icon_path=icon_path, globs=globs)
self.register(mt)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'manage_editMimeType')
def manage_editMimeType(self, name, new_name, mimetypes, extensions,
icon_path, binary=0, globs=None, REQUEST=None):
"""Edit a mime type by name
"""
mt = self.lookup(name)[0]
self.unregister(mt)
mt.edit(new_name, mimetypes, extensions, icon_path=icon_path,
binary=binary, globs=globs)
self.register(mt)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
InitializeClass(MimeTypesRegistry)
def split(name):
""" split a mime type into a (major, minor) 2-tuple """
try:
major, minor = name.split('/', 1)
except:
raise MimeTypeException('Malformed MIME type (%s)' % name)
return major, minor
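The suffix and encoding handling in lookupExtension above is easy to misread in flattened form, so here is a standalone sketch of the same steps, assuming Python 3. The function name `resolve_extension` and the module-level dicts are hypothetical stand-ins for the registry's `suffix_map` and `encodings_map` attributes; unlike the original, the sketch returns the detected encoding instead of discarding it.

```python
import os.path

# Copies of the module-level defaults shown earlier in this file.
SUFFIX_MAP = {'tgz': '.tar.gz', 'taz': '.tar.gz', 'tz': '.tar.gz'}
ENCODINGS_MAP = {'gz': 'gzip', 'Z': 'compress'}


def resolve_extension(filename):
    """Return (extension, encoding) for a file name or a bare extension."""
    if '.' not in filename:
        # A bare extension such as 'rest' is used as-is.
        return filename, None
    base, ext = os.path.splitext(filename)
    ext = ext[1:]  # remove the dot
    # Expand shorthand suffixes, e.g. 'tgz' becomes '.tar.gz'.
    while ext in SUFFIX_MAP:
        base, ext = os.path.splitext(base + SUFFIX_MAP[ext])
        ext = ext[1:]
    encoding = None
    # Strip a compression suffix and look at the extension underneath it.
    if ext in ENCODINGS_MAP and base:
        encoding = ENCODINGS_MAP[ext]
        base, ext = os.path.splitext(base)
        ext = ext[1:]
    return ext, encoding
```

The extension returned here is the key that would then be looked up in the registry's `extensions` mapping.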
# backward compatibility
from Products.MimetypesRegistry.MimeTypesRegistry import MimeTypesRegistry as MimeTypesTool
Mimetypes Registry
==================
* mimetypes_registry (the mimetypes tool) : handle mime types information
* portal_transform (the transform tool) : handle transformation of data from a
mime type to another
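As a rough illustration of the kind of information the mimetypes tool keeps, here is a minimal sketch of a major/minor lookup table with wildcard support, assuming Python 3. `TinyRegistry` and `split_mimetype` are hypothetical names, plain dicts stand in for the persistent mappings, and none of this is the product's actual API.

```python
def split_mimetype(name):
    """Split 'major/minor' into a (major, minor) tuple."""
    major, sep, minor = name.partition('/')
    if not sep:
        raise ValueError('Malformed MIME type (%s)' % name)
    return major, minor


class TinyRegistry:
    """Two-level mapping: major type -> minor type -> entry."""

    def __init__(self):
        self._types = {}

    def register(self, name, entry):
        major, minor = split_mimetype(name)
        if not major or not minor or minor == '*':
            raise ValueError("Can't register mime type %s" % name)
        self._types.setdefault(major, {})[minor] = entry

    def lookup(self, name):
        """Return a tuple of entries; 'text/*' and 'text/' match all text types."""
        major, minor = split_mimetype(name)
        group = self._types.get(major, {})
        if not minor or minor == '*':
            return tuple(group.values())
        entry = group.get(minor)
        return (entry,) if entry is not None else ()
```

Grouping by major type keeps wildcard lookups cheap, since they only need to walk one inner mapping.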
Documentation
-------------
See the *docs* directory in this package.
Mailing-list
------------
Discussion about this product takes place on the archetypes mailing list:
http://sourceforge.net/mail/?group_id=75272
or on the #plone channel of irc.freenode.net.
Authors
-------
Benjamin Saller <bcsaller@yahoo.com>
Sidnei da Silva <sidnei@x3ng.com>
Sylvain Thénault <sylvain.thenault@logilab.fr>
Christian Heimes <tiran@cheimes.de>
import os.path
__version__ = open(os.path.join(__path__[0], 'version.txt')).read().strip()
from Products.MimetypesRegistry import MimeTypesRegistry
from Products.MimetypesRegistry.common import skins_dir
GLOBALS = globals()
PKG_NAME = 'MimetypesRegistry'
tools = (
MimeTypesRegistry.MimeTypesRegistry,
)
from Products.MimetypesRegistry import mime_types
# TODO: figure out if this is used/needed anywhere
import sys
from Products.MimetypesRegistry import MimeTypeItem
sys.modules['Products.MimetypesRegistry.zope.MimeTypeItem'] = MimeTypeItem
# end TODO
def initialize(context):
from Products.CMFCore.DirectoryView import registerDirectory
registerDirectory(skins_dir, GLOBALS)
from Products.CMFCore import utils
utils.ToolInit("%s Tool" % PKG_NAME,
tools=tools,
icon="tool.gif",
).initialize(context)
from Products import MimetypesRegistry as PRODUCT
import os.path
version=PRODUCT.__version__
modname=PRODUCT.__name__
# (major, minor, patchlevel, release info) where release info is:
# -99 for alpha, -49 for beta, -19 for rc and 0 for final
# increment the release info number by one e.g. -98 for alpha2
major, minor, bugfix = version.split('.')[:3]
bugfix, release = bugfix.split('-')[:2]
relinfo=-99 #alpha
if 'beta' in release:
relinfo=-49
if 'rc' in release:
relinfo=-19
if 'final' in release:
relinfo=0
numversion = (int(major), int(minor), int(bugfix), relinfo)
license = 'BSD like'
license_text = open(os.path.join(PRODUCT.__path__[0], 'LICENSE.txt')).read()
copyright = '''Copyright (c) 2003 LOGILAB S.A. (Paris, FRANCE)'''
author = "Archetypes development team"
author_email = "archetypes-devel@lists.sourceforge.net"
short_desc = "MIME types registry for the CMF"
long_desc = """This package provides a new CMF tool for
guessing MIME types. You will find more info in the package's
README and docs directory.
.
It's part of the Archetypes project, but the only requirement to use it
is to have a CMF based site. If you are using Archetypes, this package
replaces the transform package.
.
Notice this package can also be used as a standalone Python package. If
you've downloaded the Python distribution, you can't make it a Zope
product since Zope files have been removed from this distribution.
"""
web = "http://plone.org/products/archetypes"
ftp = ""
mailing_list = "archetypes-devel@lists.sourceforge.net"
debian_name = "zope-cmfmtr"
debian_maintainer = "Christian Heimes (?)"
debian_maintainer_email = "tiran@cheimes.de"
debian_handler = "zope"
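The release-info convention described in the comments of this file (-99 for alpha, -49 for beta, -19 for rc, 0 for final, incremented by one per serial) can be sketched as below, assuming Python 3. `parse_version` is a hypothetical helper, not part of the package; matching the stage case-insensitively and applying the serial increment are assumptions based on those comments, since the code above does neither.

```python
import re

# Base release-info values taken from the comments in the file above.
RELEASE_BASE = {'alpha': -99, 'beta': -49, 'rc': -19, 'final': 0}


def parse_version(version):
    """Turn e.g. '1.4.0-alpha02' into (1, 4, 0, -98)."""
    numbers, _, release = version.partition('-')
    major, minor, bugfix = (int(part) for part in numbers.split('.')[:3])
    match = re.match(r'([a-z]+)(\d*)', release.lower())
    stage = match.group(1) if match else 'alpha'
    # Unknown or missing stages fall back to alpha, like the original default.
    relinfo = RELEASE_BASE.get(stage, -99)
    serial = int(match.group(2)) if match and match.group(2) else 1
    return (major, minor, bugfix, relinfo + max(serial, 1) - 1)
```

Tuples built this way compare in release order, so a beta sorts before the final release of the same version.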
"""some common utilities
"""
FB_REGISTRY = None
# base class
from ExtensionClass import Base
from Acquisition import aq_base
# logging function
from zLOG import LOG, INFO
def log(msg, severity=INFO, id='PortalTransforms'):
LOG(id, severity, msg)
# directory where template for the ZMI are located
import os.path
_www = os.path.join(os.path.dirname(__file__), 'www')
skins_dir = os.path.join(os.path.dirname(__file__), 'skins')
# list and dict classes to use
from Globals import PersistentMapping as DictClass
try:
from ZODB.PersistentList import PersistentList as ListClass
except ImportError:
from persistent.list import PersistentList as ListClass
# interfaces
try:
# Zope >= 2.6
from Interface import Interface, Attribute
except ImportError:
# Zope < 2.6
from Interface import Base as Interface, Attribute
def implements(object, interface):
return interface.isImplementedBy(object)
# getToolByName
from Products.CMFCore.utils import getToolByName as _getToolByName
_marker = []
def getToolByName(context, name, default=_marker):
global FB_REGISTRY
tool = _getToolByName(context, name, default)
if name == 'mimetypes_registry' and tool is default:
if FB_REGISTRY is None:
from Products.MimetypesRegistry.MimeTypesRegistry \
import MimeTypesRegistry
FB_REGISTRY = MimeTypesRegistry()
tool = FB_REGISTRY
return tool
from zExceptions import BadRequest
__all__ = ('Base', 'log', 'DictClass', 'ListClass', 'getToolByName', 'aq_base',
'Interface', 'Attribute', 'implements', 'skins_dir', '_www',
'BadRequest', )
<configure
xmlns="http://namespaces.zope.org/five"
>
<bridge
zope2=".interfaces.IMimetype"
package=".z3.interfaces"
name="IMimetype"
/>
<bridge
zope2=".interfaces.IClassifier"
package=".z3.interfaces"
name="IClassifier"
/>
<bridge
zope2=".interfaces.ISourceAdapter"
package=".z3.interfaces"
name="ISourceAdapter"
/>
<bridge
zope2=".interfaces.IMimetypesRegistry"
package=".z3.interfaces"
name="IMimetypesRegistry"
/>
</configure>
"""some common utilities
"""
from time import time
from types import UnicodeType, StringType
STRING_TYPES = (UnicodeType, StringType)
class MimeTypeException(Exception):
pass
# logging function
from zLOG import LOG, INFO
def log(msg, severity=INFO, id='MimetypesRegistry'):
LOG(id, severity, msg)
# directory where template for the ZMI are located
import os.path
_www = os.path.join(os.path.dirname(__file__), 'www')
skins_dir = os.path.join(os.path.dirname(__file__), 'skins')
<configure
xmlns="http://namespaces.zope.org/zope"
>
<include file="bridge.zcml"/>
<include file="implements.zcml"/>
</configure>
import re
import encodings
from Products.MimetypesRegistry.common import log

EMACS_ENCODING_RGX = re.compile(r'[^#]*[#\s]*-\*-\s*coding: ([^\s]*)\s*-\*-\s*')
VIM_ENCODING_RGX = re.compile(r'[^#]*[#\s]*vim:fileencoding=\s*([^\s]*)\s*')
XML_ENCODING_RGX = re.compile(r'<\?xml version=[^\s]*\s*encoding=([^\s]*)\s*\?>')
CHARSET_RGX = re.compile(r'charset=([^\s"]*)')

def guess_encoding(buffer):
    """Guess the encoding of buffer and check that python supports it

    Return the encoding name as declared in the buffer, or None if no
    encoding was detected or python has no codec for it.
    """
    encoding = _guess_encoding(buffer)
    if not encoding or not isinstance(encoding, basestring):
        return None
    # try to find an encoding function for the encoding
    # python uses lower case names for encodings, so the lookup is done on
    # the lower-cased name while the declared name is what gets returned
    # if None is returned or an exception is raised the encoding is invalid
    try:
        result = encodings.search_function(encoding.lower())
    except:
        # XXX log
        result = None
    if result:
        # got a valid encoding
        return encoding
    else:
        return None

def _guess_encoding(buffer):
    """try to guess encoding from a buffer

    FIXME: it could be mime type driven but it seems less painful like that
    """
    assert type(buffer) is type(''), type(buffer)
    # default to ascii on empty buffer
    if not buffer:
        return 'ascii'
    # check for UTF-8 byte-order mark
    if buffer.startswith('\xef\xbb\xbf'):
        return 'UTF-8'
    first_lines = buffer.split('\n')[:2]
    for line in first_lines:
        # check for emacs encoding declaration
        m = EMACS_ENCODING_RGX.match(line)
        if m is not None:
            return m.group(1)
        # check for vim encoding declaration
        m = VIM_ENCODING_RGX.match(line)
        if m is not None:
            return m.group(1)
    # check for xml encoding declaration
    if first_lines[0].startswith('<?xml'):
        m = XML_ENCODING_RGX.match(first_lines[0])
        if m is not None:
            # strip the quotes around the declared encoding
            return m.group(1)[1:-1]
        # xml files with no encoding declaration default to UTF-8
        return 'UTF-8'
    # try to get charset declaration
    # FIXME: should we check it's html before ?
    m = CHARSET_RGX.search(buffer)
    if m is not None:
        return m.group(1)
    return None
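The detection order used above (empty buffer, UTF-8 byte-order mark, editor declaration, XML declaration, charset attribute, then a codec check) can be sketched as a standalone example. It is an illustration only, written for Python 3 bytes rather than the Python 2 strings of the module above; `declared_encoding`, `checked_encoding` and the simplified patterns are assumptions, the vim check is left out for brevity, and `codecs.lookup` stands in for the `encodings.search_function` call.

```python
import codecs
import re

# Simplified patterns; encoding names are limited to ASCII word characters,
# dots and dashes so they can always be decoded safely.
EMACS_RGX = re.compile(rb'-\*-\s*coding: ([\w.-]+)\s*-\*-')
XML_RGX = re.compile(rb'<\?xml version=\S*\s*encoding=["\']([\w.-]+)["\']')
CHARSET_RGX = re.compile(rb'charset=["\']?([\w.-]+)')


def declared_encoding(data):
    """Return the encoding declared in data (bytes), or None."""
    if not data:
        return 'ascii'
    if data.startswith(codecs.BOM_UTF8):
        return 'UTF-8'
    first_lines = data.split(b'\n')[:2]
    for line in first_lines:
        m = EMACS_RGX.search(line)
        if m is not None:
            return m.group(1).decode('ascii')
    if first_lines[0].startswith(b'<?xml'):
        m = XML_RGX.match(first_lines[0])
        if m is not None:
            return m.group(1).decode('ascii')
        # XML without an encoding declaration defaults to UTF-8
        return 'UTF-8'
    m = CHARSET_RGX.search(data)
    if m is not None:
        return m.group(1).decode('ascii')
    return None


def checked_encoding(data):
    """Return the declared encoding only if Python has a codec for it."""
    name = declared_encoding(data)
    if name is None:
        return None
    try:
        codecs.lookup(name)
    except LookupError:
        return None
    return name
```

As in guess_encoding, the name is returned as declared, so 'ISO-8859-1' stays upper case even though the codec is registered under a lower-case name.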
<configure
xmlns="http://namespaces.zope.org/five"
>
<!-- TODO: more -->
</configure>
from Interface import Interface, Attribute

class IMimetype(Interface):
    """Specification for dealing with mimetypes RFC-2046 style"""

    # mimetypes = Attribute("List of mimetypes in the RFC-2046 format")
    # extensions = Attribute("""List of extensions mapped to this
    # mimetype w/o the leading .""")
    # binary = Attribute("""Boolean indicating if the mimetype should be
    # treated as binary (and not human readable)""")

    def name(self):
        """return the human-readable name of the mimetype"""

    def major(self):
        """return the major part of the RFC-2046 name for this mime type"""

    def minor(self):
        """return the minor part of the RFC-2046 name for this mime type"""

    def normalized(self):
        """return the main RFC-2046 name for this mime type

        e.g. if this object has names ('text/restructured', 'text/x-rst')
        then self.normalized() will always return the first form.
        """

class IClassifier(Interface):
    """Optional mixin interface for IMimetype: code to test if the
    mimetype is present in data
    """

    def classify(data):
        """return a boolean indicating if the data fits the mimetype"""

class ISourceAdapter(Interface):

    def __call__(data, **kwargs):
        """convert data to unicode, may take optional kwargs to aid in
        conversion"""

class IMimetypesRegistry(Interface):

    def classify(data, mimetype=None, filename=None):
        """return a content type for this data or None

        None should rarely be returned as application/octet-stream can be
        used to represent most types
        """

    def lookup(mimetypestring):
        """Look up the IMimetype objects matching mimetypestring

        mimetypestring may have an empty minor part or contain a wildcard (*)
        mimetypestring may be an IMimetype object (in this case it will be
        returned unchanged), else it should be an RFC-2046 name
        return a list of mimetype objects associated with the RFC-2046 name
        return an empty list if none is known.
        """

    def lookupExtension(filename):
        """return the mimetype object associated with the file's extension

        return None if it is not known.
        filename may be a file name like 'content.txt' or an extension
        like 'rest'
        """

    def mimetypes():
        """return all defined mime types, each one implements at least
        IMimetype
        """

    def list_mimetypes():
        """return all defined mime types, as strings"""
import mtr_mimetypes
import py_mimetypes
import smi_mimetypes
import suppl_mimetypes
import magic
from mtr_mimetypes import *
def initialize(registry):
    mtr_mimetypes.initialize(registry)
    smi_mimetypes.initialize(registry)
    suppl_mimetypes.initialize(registry)
    py_mimetypes.initialize(registry)
"""
magic.py
Initial Author: Jason Petrone <jp@demonseed.net>
Updated by Gabriel Wicke <groups@gabrielwicke.de>
Thu Oct 16 23:00:03 CEST 2003
with magic data from gnome-vfs-mime-magic
"""
import re
import struct
import string
from StringIO import StringIO
from zipfile import ZipFile
from zipfile import BadZipfile
from xml.dom import minidom
__version__ = '$Revision: 1.2 $'[11:-2]
magic = [
[0L, 'string', '=', '%PDF-', 'application/pdf'],
[0L, 'string', '=', '\177ELF', 'application/x-executable-binary'],
[0L, 'string', '=', '\004%!', 'application/postscript'],
[0L, 'string', '=', '\000\000\001\272', 'video/mpeg'],
[0L, 'string', '=', '\000\000\001\263', 'video/mpeg'],
[0L, 'string', '=', '\x47\x3f\xff\x10', 'video/mpeg'],
[0L, 'string', '=', '\377\330\377', 'image/jpeg'],
[0L, 'string', '=', '\xed\xab\xee\xdb', 'application/x-rpm'],
[0L, 'string', '=', 'Rar!', 'application/x-rar'],
[257L, 'string', '=', 'ustar\0', 'application/x-tar'],
[257L, 'string', '=', 'ustar\040\040\0', 'application/x-gtar'],
# the following detection of OOo is according to
# http://books.evc-cit.info/oobook/ch01.html
# and some heuristics found in hexeditor. if theres a better way to detect,
# we should replace the signatures below.
# best would to just read and evaluate the manifest file of the zip, but
# the magic tests are running on the first 8kB, so we cant unzip the
# manifest in files >8kB.
[30L, 'string', '=', 'mimetypeapplication/vnd.sun.xml.writer',
'application/vnd.sun.xml.writer'],
[30L, 'string', '=', 'mimetypeapplication/vnd.sun.xml.calc',
'application/vnd.sun.xml.calc'],
[30L, 'string', '=', 'mimetypeapplication/vnd.sun.xml.draw',
'application/vnd.sun.xml.draw'],
[30L, 'string', '=', 'mimetypeapplication/vnd.sun.xml.impress',
'application/vnd.sun.xml.impress'],
[30L, 'string', '=', 'mimetypeapplication/vnd.sun.xml.chart',
'application/vnd.sun.xml.chart'],
[30L, 'string', '=', 'mimetypeapplication/vnd.sun.xml.global',
'application/vnd.sun.xml.global'],
# zip works now, after we have it with lower priority than OOo
[0L, 'string', '=', 'PK\003\004', 'application/zip'],
[0L, 'string', '=', 'GIF8', 'image/gif'],
[4L, 'string', '=', 'moov', 'video/quicktime'],
[4L, 'string', '=', 'mdat', 'video/quicktime'],
[8L, 'string', '=', 'mp42', 'video/quicktime'],
[12L, 'string', '=', 'mdat', 'video/quicktime'],
[36L, 'string', '=', 'mdat', 'video/quicktime'],
[0L, 'belong', '=', '0x3026b275', 'video/x-ms-asf'],
[0L, 'string', '=', 'ASF ', 'audio/x-ms-asx'],
[0L, 'string', '=', '<ASX', 'audio/x-ms-asx'],
[0L, 'string', '=', '<asx', 'audio/x-ms-asx'],
[0L, 'string', '=', 'MThd', 'audio/x-midi'],
[0L, 'string', '=', 'IMPM', 'audio/x-it'],
[2L, 'string', '=', '-lh0-', 'application/x-lha'],
[2L, 'string', '=', '-lh1-', 'application/x-lha'],
[2L, 'string', '=', '-lz4-', 'application/x-lha'],
[2L, 'string', '=', '-lz5-', 'application/x-lha'],
[2L, 'string', '=', '-lzs-', 'application/x-lha'],
[2L, 'string', '=', '-lh\40-', 'application/x-lha'],
[2L, 'string', '=', '-lhd-', 'application/x-lha'],
[2L, 'string', '=', '-lh2-', 'application/x-lha'],
[2L, 'string', '=', '-lh3-', 'application/x-lha'],
[2L, 'string', '=', '-lh4-', 'application/x-lha'],
[2L, 'string', '=', '-lh5-', 'application/x-lha'],
[20L, 'string', '=', '\375\304\247\334', 'application/x-zoo'],
[0L, 'string', '=', 'StuffIt ', 'application/x-stuffit'],
[11L, 'string', '=', 'must be converted with BinHex', 'application/mac-binhex40'],
[102L, 'string', '=', 'mBIN', 'application/x-macbinary'],
[4L, 'string', '=', 'gtktalog ', 'application/x-gtktalog'],
[0L, 'string', '=', 'diff ', 'text/x-patch'],
[0L, 'string', '=', 'Index:', 'text/x-patch'],
[0L, 'string', '=', '*** ', 'text/x-patch'],
[0L, 'string', '=', 'Only in ', 'text/x-patch'],
[0L, 'string', '=', 'Common subdirectories: ', 'text/x-patch'],
[0L, 'string', '=', 'FONT', 'application/x-font-vfont'],
[0L, 'string', '=', 'IIN1', 'image/tiff'],
[0L, 'string', '=', 'MM\x00\x2a', 'image/tiff'],
[0L, 'string', '=', 'II\x2a\x00', 'image/tiff'],
[0L, 'string', '=', '\x89PNG', 'image/png'],
[0L, 'string', '=', '8BPS\ \ \000\000\000\000 &0xffffffff0000ffffffff', 'image/x-psd'],
[0L, 'string', '=', '#LyX', 'text/x-lyx'],
[0L, 'string', '=', 'DCMw', 'image/x-dcm'],
[0L, 'string', '=', 'gimp xcf', 'application/x-gimp-image'],
[0L, 'belong', '=', '0x59a66a95', 'image/x-sun-raster'],
[0L, 'belong', '=', '0x01da0000 &0xfcfeffff', 'image/x-sgi'],
[0L, 'belong', '=', '0xb168de3a', 'image/x-pcx'],
[0L, 'string', '=', '\x28\x00\x00\x00', 'image/x-dib'],
[0L, 'string', '=', 'SIMPLE =', 'image/x-fits'],
[0L, 'belong', '=', '0x46506978', 'image/x-fpx'],
[0L, 'belong', '=', '0x00000200', 'image/x-icb'],
[0L, 'belong', '=', '0x53445058', 'image/x-dpx'],
[0L, 'string', '=', '[Desktop Entry]', 'application/x-gnome-app-info'],
[0L, 'string', '=', '[X-GNOME-Metatheme]', 'application/x-gnome-theme'],
[0L, 'string', '=', '<nautilus_object nautilus_link', 'application/x-nautilus-link'],
[0L, 'string', '=', 'URL:', 'application/x-gmc-link'],
[0L, 'string', '=', '/* XPM */', 'image/x-xpixmap'],
[0L, 'string', '=', '<!DOCTYPE xbel', 'application/xbel'],
[0L, 'string', '=', '<xbel', 'application/xbel'],
[0L, 'string', '=', '<!DOCTYPE NETSCAPE-Bookmark-file-1\>', 'application/x-mozilla-bookmarks'],
[0L, 'string', '=', '<!DOCTYPE NETSCAPE-Bookmark-file-1\>', 'application/x-netscape-bookmarks'],
[0L, 'string', '=', '<ephy_bookmarks ', 'application/x-epiphany-bookmarks'],
[0L, 'string', '=', '<!DOCTYPE svg', 'image/svg'],
[0L, 'string', '=', '<svg', 'image/svg'],
[0L, 'string', '=', '<?php', 'application/x-php'],
[0L, 'string', '=', '<smil\>', 'application/x-smil'],
[0L, 'string', '=', '<SMIL\>', 'application/x-smil'],
[0L, 'string', '=', '<!DOCTYPE HTML', 'text/html'],
[0L, 'string', '=', '<!DOCTYPE html', 'text/html'],
[0L, 'string', '=', '<!doctype html', 'text/html'],
[0L, 'string', '=', '<!doctype Html', 'text/html'],
[0L, 'string', '=', '<!doctype HTML', 'text/html'],
[10L, 'string', '=', '<HEAD', 'text/html'],
[10L, 'string', '=', '<head', 'text/html'],
[16L, 'string', '=', '<TITLE', 'text/html'],
[16L, 'string', '=', '<title', 'text/html'],
[10L, 'string', '=', '<html', 'text/html'],
[0L, 'string', '=', '<HTML', 'text/html'],
[0L, 'string', '=', '<dia:diagram', 'application/x-dia-diagram'],
[0L, 'string', '=', '<abiword', 'application/x-abiword'],
[0L, 'string', '=', '<\!DOCTYPE abiword', 'application/x-abiword'],
[0L, 'string', '=', 'gmr:Workbook', 'application/x-gnumeric'],
[0L, 'string', '=', '<?xml', 'text/xml'],
[0L, 'string', '=', '{\\rtf', 'application/rtf'],
[0L, 'string', '=', '#!/bin/sh', 'text/x-sh'],
[0L, 'string', '=', '#!/bin/bash', 'text/x-sh'],
[0L, 'string', '=', '#!/bin/csh', 'text/x-csh'],
[0L, 'string', '=', '#!/bin/ksh', 'text/x-ksh'],
[0L, 'string', '=', '#!/bin/perl', 'text/x-perl'],
[0L, 'string', '=', '#!/bin/zsh', 'text/x-zsh'],
[1L, 'string', '=', '/bin/sh', 'text/x-sh'],
[1L, 'string', '=', '/bin/bash', 'text/x-sh'],
[1L, 'string', '=', '/bin/csh', 'text/x-csh'],
[1L, 'string', '=', '/bin/ksh', 'text/x-ksh'],
[1L, 'string', '=', '/bin/perl', 'text/x-perl'],
[0L, 'string', '=', 'BEGIN:VCARD', 'text/x-vcard'],
[0L, 'string', '=', 'BEGIN:VCALENDAR', 'text/calendar'],
[8L, 'string', '=', 'CDR vrsn', 'application/vnd.corel-draw'],
[8L, 'string', '=', 'AVI ', 'video/x-msvideo'],
[0L, 'string', '=', 'MOVI', 'video/x-sgi-movie'],
[0L, 'string', '=', '.snd', 'audio/basic'],
[8L, 'string', '=', 'AIFC', 'audio/x-aifc'],
[8L, 'string', '=', 'AIFF', 'audio/x-aiff'],
[0L, 'string', '=', '.ra\375', 'audio/x-pn-realaudio'],
[0L, 'belong', '=', '0x2e7261fd', 'audio/x-pn-realaudio'],
[0L, 'string', '=', '.RMF', 'audio/x-pn-realaudio'],
[8L, 'string', '=', 'WAVE', 'audio/x-wav'],
[8L, 'string', '=', 'WAV ', 'audio/x-wav'],
[0L, 'string', '=', 'ID3', 'audio/mpeg'],
[0L, 'string', '=', '0xfff0', 'audio/mpeg'],
[0L, 'string', '=', '\x00\x00\x01\xba', 'video/mpeg'],
[8L, 'string', '=', 'CDXA', 'video/mpeg'],
[0L, 'belong', '=', '0x000001ba', 'video/mpeg'],
[0L, 'belong', '=', '0x000001b3', 'video/mpeg'],
[0L, 'string', '=', 'RIFF', 'audio/x-riff'],
[0L, 'string', '=', 'OggS ', 'application/ogg'],
[0L, 'string', '=', 'pnm:\/\/', 'audio/x-real-audio'],
[0L, 'string', '=', 'rtsp:\/\/', 'audio/x-real-audio'],
[0L, 'string', '=', 'SIT!', 'application/x-stuffit'],
[0L, 'string', '=', '\312\376\272\276', 'application/x-java-byte-code'],
[0L, 'string', '=', 'Joy!', 'application/x-pef-executable'],
[4L, 'string', '=', '\x11\xAF', 'video/x-fli'],
[4L, 'string', '=', '\x12\xAF', 'video/x-flc'],
[0L, 'string', '=', '\x31\xbe\x00\x00', 'application/msword'],
[0L, 'string', '=', 'PO^Q`', 'application/msword'],
[0L, 'string', '=', '\376\067\0\043', 'application/msword'],
[0L, 'string', '=', '\320\317\021\340\241\261', 'application/msword'],
[0L, 'string', '=', '\333\245-\0\0\0', 'application/msword'],
[0L, 'string', '=', 'Microsoft Excel 5.0 Worksheet', 'application/vnd.ms-excel'],
[0L, 'string', '=', 'Biff5', 'application/vnd.ms-excel'],
[0L, 'string', '=', '*BEGIN SPREADSHEETS ', 'application/x-applix-spreadsheet'],
[0L, 'string', '=', '*BEGIN SPREADSHEETS ', 'application/x-applix-spreadsheet'],
[0L, 'string', '=', '\x00\x00\x02\x00', 'application/vnd.lotus-1-2-3'],
[0L, 'belong', '=', '0x00001a00', 'application/vnd.lotus-1-2-3'],
[0L, 'belong', '=', '0x00000200', 'application/vnd.lotus-1-2-3'],
[0L, 'string', '=', 'PSID', 'audio/prs.sid'],
[31L, 'string', '=', 'Oleo', 'application/x-oleo'],
[0L, 'string', '=', 'FFIL', 'application/x-font-ttf'],
[65L, 'string', '=', 'FFIL', 'application/x-font-ttf'],
[0L, 'string', '=', 'LWFN', 'application/x-font-type1'],
[65L, 'string', '=', 'LWFN', 'application/x-font-type1'],
[0L, 'string', '=', 'StartFont', 'application/x-font-sunos-news'],
[0L, 'string', '=', '\x13\x7A\x29', 'application/x-font-sunos-news'],
[8L, 'string', '=', '\x13\x7A\x2B', 'application/x-font-sunos-news'],
[0L, 'string', '=', '%!PS-AdobeFont-1.', 'application/x-font-type1'],
[6L, 'string', '=', '%!PS-AdobeFont-1.', 'application/x-font-type1'],
[0L, 'string', '=', '%!FontType1-1.', 'application/x-font-type1'],
[6L, 'string', '=', '%!FontType1-1.', 'application/x-font-type1'],
[0L, 'string', '=', 'STARTFONT\040', 'application/x-font-bdf'],
[0L, 'string', '=', '\001fcp', 'application/x-font-pcf'],
[0L, 'string', '=', 'D1.0\015', 'application/x-font-speedo'],
[0L, 'string', '=', '\x14\x02\x59\x19', 'application/x-font-libgrx'],
[0L, 'string', '=', '\xff\x46\x4f\x4e', 'application/x-font-dos'],
[7L, 'string', '=', '\x00\x45\x47\x41', 'application/x-font-dos'],
[7L, 'string', '=', '\x00\x56\x49\x44', 'application/x-font-dos'],
[0L, 'string', '=', '\<MakerScreenFont', 'application/x-font-framemaker'],
[0L, 'string', '=', '\000\001\000\000\000', 'application/x-font-ttf'],
[1L, 'string', '=', 'WPC', 'application/x-wordperfect'],
[0L, 'string', '=', 'ID;', 'text/spreadsheet'],
[0L, 'string', '=', 'MZ', 'application/x-ms-dos-executable'],
[0L, 'string', '=', '%!', 'application/postscript'],
[0L, 'string', '=', 'BZh', 'application/x-bzip'],
[0L, 'string', '=', '\x1f\x8b', 'application/x-gzip'],
[0L, 'string', '=', '\037\235', 'application/x-compress'],
[0L, 'string', '=', '\367\002', 'application/x-dvi'],
[0L, 'string', '=', '\367\203', 'application/x-font-tex'],
[0L, 'string', '=', '\367\131', 'application/x-font-tex'],
[0L, 'string', '=', '\367\312', 'application/x-font-tex'],
[2L, 'string', '=', '\000\022', 'application/x-font-tex-tfm'],
[0L, 'string', '=', '\x36\x04', 'application/x-font-linux-psf'],
[0L, 'string', '=', 'FWS', 'application/x-shockwave-flash'],
[0L, 'string', '=', 'CWS', 'application/x-shockwave-flash'],
[0L, 'string', '=', 'NSVf', 'video/x-nsv'],
[0L, 'string', '=', 'BMxxxx\000\000 &0xffff00000000ffff', 'image/bmp'],
[0L, 'string', '=', 'Return-Path:', 'message/rfc822'],
[0L, 'string', '=', 'Path:', 'message/news'],
[0L, 'string', '=', 'Xref:', 'message/news'],
[0L, 'string', '=', 'From:', 'message/rfc822'],
[0L, 'string', '=', 'Article', 'message/news'],
[0L, 'string', '=', 'Received:', 'message/rfc822'],
[0L, 'string', '=', '[playlist]', 'audio/x-scpls'],
[0L, 'string', '=', '[Reference]', 'video/x-ms-asf'],
[0L, 'string', '=', 'fLaC', 'application/x-flac'],
[32769L, 'string', '=', 'CD001', 'application/x-iso-image'],
[37633L, 'string', '=', 'CD001', 'application/x-iso-image'],
[32776L, 'string', '=', 'CDROM', 'application/x-iso-image'],
[0L, 'string', '=', 'OTTO', 'application/x-font-otf'],
[54L, 'string', '=', 'S T O P', 'application/x-ipod-firmware'],
[0L, 'string', '=', 'BLENDER', 'application/x-blender'],
[20L, 'string', '=', 'import', 'text/python-source'],
]
magicNumbers = []
def strToNum(n):
    val = 0
    col = long(1)
    if n[:1] == 'x': n = '0' + n
    if n[:2] == '0x':
        # hex
        n = string.lower(n[2:])
        while len(n) > 0:
            l = n[len(n) - 1]
            val = val + string.hexdigits.index(l) * col
            col = col * 16
            n = n[:len(n)-1]
    elif n[0] == '\\':
        # octal
        n = n[1:]
        while len(n) > 0:
            l = n[len(n) - 1]
            if ord(l) < 48 or ord(l) > 57: break
            val = val + int(l) * col
            col = col * 8
            n = n[:len(n)-1]
    else:
        val = string.atol(n)
    return val

class magicTest:
    def __init__(self, offset, t, op, value, msg, mask = None):
        if t.count('&') > 0:
            mask = strToNum(t[t.index('&')+1:])
            t = t[:t.index('&')]
        if type(offset) == type('a'):
            self.offset = strToNum(offset)
        else:
            self.offset = offset
        self.type = t
        self.msg = msg
        self.subTests = []
        self.op = op
        self.mask = mask
        self.value = value

    def test(self, data):
        if self.mask:
            data = data & self.mask
        if self.op == '=':
            # NOTE: self.value is used as given. The 'belong' entries of the
            # magic table store it as a string (e.g. '0x3026b275'), which
            # never compares equal to the integer unpacked in compare().
            if self.value == data: return self.msg
        elif self.op == '<':
            pass
        elif self.op == '>':
            pass
        elif self.op == '&':
            pass
        elif self.op == '^':
            pass
        return None

    def compare(self, data):
        #print str([self.type, self.value, self.msg])
        try:
            if self.type == 'string':
                # take len(self.value) characters at the offset; the previous
                # character-by-character loop dropped the last character when
                # the signature ended exactly at the end of the data
                data = data[self.offset : self.offset + len(self.value)]
            elif self.type == 'short':
                [data] = struct.unpack('h', data[self.offset : self.offset + 2])
            elif self.type == 'leshort':
                [data] = struct.unpack('<h', data[self.offset : self.offset + 2])
            elif self.type == 'beshort':
                [data] = struct.unpack('>H', data[self.offset : self.offset + 2])
            elif self.type == 'long':
                [data] = struct.unpack('l', data[self.offset : self.offset + 4])
            elif self.type == 'lelong':
                [data] = struct.unpack('<l', data[self.offset : self.offset + 4])
            elif self.type == 'belong':
                [data] = struct.unpack('>l', data[self.offset : self.offset + 4])
            else:
                #print 'UNKNOWN TYPE: ' + self.type
                pass
        except:
            return None
        # print str([self.msg, self.value, data])
        return self.test(data)

def guessMime(data):
    # tests run in table order, so more specific signatures must come first
    for test in magicNumbers:
        m = test.compare(data)
        if m:
            return m
    # no matching magic number
    return None

#import sys
for m in magic:
    magicNumbers.append(magicTest(m[0], m[1], m[2], m[3], m[4]))
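The table-driven matching done by guessMime can be sketched in a self-contained form. This is an illustration under assumptions, not the module's implementation: it is written for Python 3 bytes, `SIGNATURES` is a hypothetical five-entry excerpt of the magic table, `sniff_mime` is a hypothetical name, and the big-endian value is stored as an integer and unpacked unsigned so that the comparison can succeed.

```python
import struct

# (offset, type, expected value, mime type); order matters, so the
# OpenOffice.org signature is listed before the generic zip signature.
SIGNATURES = [
    (0, 'string', b'%PDF-', 'application/pdf'),
    (30, 'string', b'mimetypeapplication/vnd.sun.xml.writer',
     'application/vnd.sun.xml.writer'),
    (0, 'string', b'PK\x03\x04', 'application/zip'),
    (0, 'string', b'GIF8', 'image/gif'),
    (0, 'belong', 0x59a66a95, 'image/x-sun-raster'),
]


def sniff_mime(data):
    """Return the mime type of the first matching signature, or None."""
    for offset, kind, expected, mimetype in SIGNATURES:
        if kind == 'string':
            found = data[offset:offset + len(expected)]
        elif kind == 'belong':
            chunk = data[offset:offset + 4]
            if len(chunk) < 4:
                continue  # not enough data for a 32-bit value
            (found,) = struct.unpack('>L', chunk)
        else:
            continue  # other types are not handled in this sketch
        if found == expected:
            return mimetype
    return None
```

A zip archive whose first entry is an uncompressed `mimetype` file is reported as the OpenOffice.org type because that entry is tested first; any other zip falls through to `application/zip`.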
application/activemessage
application/andrew-inset ez
application/applefile
application/atomicmail
application/batch-SMTP
application/beep+xml
application/cals-1840
application/commonground
application/cu-seeme cu
application/cybercash
application/dca-rft
application/dec-dx
application/docbook+xml
application/dsptype tsp
application/dvcs
application/edi-consent
application/edi-x12
application/edifact
application/eshop
application/font-tdpfr
application/futuresplash spl
application/ghostview
application/hta hta
application/http
application/hyperstudio
application/iges
application/index
application/index.cmd
application/index.obj
application/index.response
application/index.vnd
application/iotp
application/ipp
application/isup
application/java-archive jar
application/java-serialized-object ser
application/java-vm class
application/mac-binhex40 hqx
application/mac-compactpro cpt
application/macwriteii
application/marc
application/mathematica nb
application/mathematica-old
application/msaccess mdb
application/msword doc dot
application/news-message-id
application/news-transmission
application/ocsp-request
application/ocsp-response
application/octet-stream bin
application/oda oda
application/ogg ogg
application/parityfec
application/pdf pdf
application/pgp-encrypted
application/pgp-keys key
application/pgp-signature pgp
application/pics-rules prf
application/pkcs10
application/pkcs7-mime
application/pkcs7-signature
application/pkix-cert
application/pkix-crl
application/pkixcmp
application/postscript ps ai eps
application/prs.alvestrand.titrax-sheet
application/prs.cww
application/prs.nprend
application/qsig
application/rar rar
application/rdf+xml rdf
application/remote-printing
application/riscos
application/rss+xml rss
application/rtf
application/sdp
application/set-payment
application/set-payment-initiation
application/set-registration
application/set-registration-initiation
application/sgml
application/sgml-open-catalog
application/sieve
application/slate
application/smil smi smil
application/timestamp-query
application/timestamp-reply
application/vemmi
application/whoispp-query
application/whoispp-response
application/wita
application/wordperfect wpd
application/wordperfect5.1 wp5
application/x400-bp
application/xhtml+xml xhtml xht
application/xml xml xsl
application/xml-dtd
application/xml-external-parsed-entity
application/zip zip
application/vnd.3M.Post-it-Notes
application/vnd.accpac.simply.aso
application/vnd.accpac.simply.imp
application/vnd.acucobol
application/vnd.aether.imp
application/vnd.anser-web-certificate-issue-initiation
application/vnd.anser-web-funds-transfer-initiation
application/vnd.audiograph
application/vnd.bmi
application/vnd.businessobjects
application/vnd.canon-cpdl
application/vnd.canon-lips
application/vnd.cinderella cdy
application/vnd.claymore
application/vnd.commerce-battelle
application/vnd.commonspace
application/vnd.comsocaller
application/vnd.contact.cmsg
application/vnd.cosmocaller
application/vnd.ctc-posml
application/vnd.cups-postscript
application/vnd.cups-raster
application/vnd.cups-raw
application/vnd.cybank
application/vnd.dna
application/vnd.dpgraph
application/vnd.dxr
application/vnd.ecdis-update
application/vnd.ecowin.chart
application/vnd.ecowin.filerequest
application/vnd.ecowin.fileupdate
application/vnd.ecowin.series
application/vnd.ecowin.seriesrequest
application/vnd.ecowin.seriesupdate
application/vnd.enliven
application/vnd.epson.esf
application/vnd.epson.msf
application/vnd.epson.quickanime
application/vnd.epson.salt
application/vnd.epson.ssf
application/vnd.ericsson.quickcall
application/vnd.eudora.data
application/vnd.fdf
application/vnd.ffsns
application/vnd.flographit
application/vnd.framemaker
application/vnd.fsc.weblaunch
application/vnd.fujitsu.oasys
application/vnd.fujitsu.oasys2
application/vnd.fujitsu.oasys3
application/vnd.fujitsu.oasysgp
application/vnd.fujitsu.oasysprs
application/vnd.fujixerox.ddd
application/vnd.fujixerox.docuworks
application/vnd.fujixerox.docuworks.binder
application/vnd.fut-misnet
application/vnd.grafeq
application/vnd.groove-account
application/vnd.groove-identity-message
application/vnd.groove-injector
application/vnd.groove-tool-message
application/vnd.groove-tool-template
application/vnd.groove-vcard
application/vnd.hhe.lesson-player
application/vnd.hp-HPGL
application/vnd.hp-PCL
application/vnd.hp-PCLXL
application/vnd.hp-hpid
application/vnd.hp-hps
application/vnd.httphone
application/vnd.hzn-3d-crossword
application/vnd.ibm.MiniPay
application/vnd.ibm.afplinedata
application/vnd.ibm.modcap
application/vnd.informix-visionary
application/vnd.intercon.formnet
application/vnd.intertrust.digibox
application/vnd.intertrust.nncp
application/vnd.intu.qbo
application/vnd.intu.qfx
application/vnd.irepository.package+xml
application/vnd.is-xpr
application/vnd.japannet-directory-service
application/vnd.japannet-jpnstore-wakeup
application/vnd.japannet-payment-wakeup
application/vnd.japannet-registration
application/vnd.japannet-registration-wakeup
application/vnd.japannet-setstore-wakeup
application/vnd.japannet-verification
application/vnd.japannet-verification-wakeup
application/vnd.koan
application/vnd.lotus-1-2-3
application/vnd.lotus-approach
application/vnd.lotus-freelance
application/vnd.lotus-notes
application/vnd.lotus-organizer
application/vnd.lotus-screencam
application/vnd.lotus-wordpro
application/vnd.mcd
application/vnd.mediastation.cdkey
application/vnd.meridian-slingshot
application/vnd.mif
application/vnd.minisoft-hp3000-save
application/vnd.mitsubishi.misty-guard.trustweb
application/vnd.mobius.daf
application/vnd.mobius.dis
application/vnd.mobius.msl
application/vnd.mobius.plc
application/vnd.mobius.txf
application/vnd.motorola.flexsuite
application/vnd.motorola.flexsuite.adsi
application/vnd.motorola.flexsuite.fis
application/vnd.motorola.flexsuite.gotap
application/vnd.motorola.flexsuite.kmr
application/vnd.motorola.flexsuite.ttc
application/vnd.motorola.flexsuite.wem
application/vnd.mozilla.xul+xml xul
application/vnd.ms-artgalry
application/vnd.ms-asf
application/vnd.ms-excel xls xlb xlt
application/vnd.ms-lrm
application/vnd.ms-pki.seccat cat
application/vnd.ms-pki.stl stl
application/vnd.ms-powerpoint ppt pps
application/vnd.ms-project
application/vnd.ms-tnef
application/vnd.ms-works
application/vnd.mseq
application/vnd.msign
application/vnd.music-niff
application/vnd.musician
application/vnd.netfpx
application/vnd.noblenet-directory
application/vnd.noblenet-sealer
application/vnd.noblenet-web
application/vnd.novadigm.EDM
application/vnd.novadigm.EDX
application/vnd.novadigm.EXT
application/vnd.oasis.opendocument.chart odc
application/vnd.oasis.opendocument.database odb
application/vnd.oasis.opendocument.formula odf
application/vnd.oasis.opendocument.graphics odg
application/vnd.oasis.opendocument.graphics-template otg
application/vnd.oasis.opendocument.image odi
application/vnd.oasis.opendocument.presentation odp
application/vnd.oasis.opendocument.presentation-template otp
application/vnd.oasis.opendocument.spreadsheet ods
application/vnd.oasis.opendocument.spreadsheet-template ots
application/vnd.oasis.opendocument.text odt
application/vnd.oasis.opendocument.text-master odm
application/vnd.oasis.opendocument.text-template ott
application/vnd.oasis.opendocument.text-web oth
application/vnd.osa.netdeploy
application/vnd.palm
application/vnd.pg.format
application/vnd.pg.osasli
application/vnd.powerbuilder6
application/vnd.powerbuilder6-s
application/vnd.powerbuilder7
application/vnd.powerbuilder7-s
application/vnd.powerbuilder75
application/vnd.powerbuilder75-s
application/vnd.previewsystems.box
application/vnd.publishare-delta-tree
application/vnd.pvi.ptid1
application/vnd.pwg-xhtml-print+xml
application/vnd.rapid
application/vnd.rim.cod cod
application/vnd.s3sms
application/vnd.seemail
application/vnd.shana.informed.formdata
application/vnd.shana.informed.formtemplate
application/vnd.shana.informed.interchange
application/vnd.shana.informed.package
application/vnd.smaf mmf
application/vnd.sss-cod
application/vnd.sss-dtf
application/vnd.sss-ntf
application/vnd.stardivision.calc sdc
application/vnd.stardivision.draw sda
application/vnd.stardivision.impress sdd sdp
application/vnd.stardivision.math smf
application/vnd.stardivision.writer sdw vor
application/vnd.stardivision.writer-global sgl
application/vnd.street-stream
application/vnd.sun.xml.calc sxc
application/vnd.sun.xml.calc.template stc
application/vnd.sun.xml.draw sxd
application/vnd.sun.xml.draw.template std
application/vnd.sun.xml.impress sxi
application/vnd.sun.xml.impress.template sti
application/vnd.sun.xml.math sxm
application/vnd.sun.xml.writer sxw
application/vnd.sun.xml.writer.global sxg
application/vnd.sun.xml.writer.template stw
application/vnd.svd
application/vnd.swiftview-ics
application/vnd.symbian.install sis
application/vnd.triscape.mxs
application/vnd.trueapp
application/vnd.truedoc
application/vnd.tve-trigger
application/vnd.ufdl
application/vnd.uplanet.alert
application/vnd.uplanet.alert-wbxml
application/vnd.uplanet.bearer-choice
application/vnd.uplanet.bearer-choice-wbxml
application/vnd.uplanet.cacheop
application/vnd.uplanet.cacheop-wbxml
application/vnd.uplanet.channel
application/vnd.uplanet.channel-wbxml
application/vnd.uplanet.list
application/vnd.uplanet.list-wbxml
application/vnd.uplanet.listcmd
application/vnd.uplanet.listcmd-wbxml
application/vnd.uplanet.signal
application/vnd.vcx
application/vnd.vectorworks
application/vnd.vidsoft.vidconference
application/vnd.visio vsd
application/vnd.vividence.scriptfile
application/vnd.wap.sic
application/vnd.wap.slc
application/vnd.wap.wbxml wbxml
application/vnd.wap.wmlc wmlc
application/vnd.wap.wmlscriptc wmlsc
application/vnd.webturbo
application/vnd.wrq-hp3000-labelled
application/vnd.wt.stf
application/vnd.xara
application/vnd.xfdl
application/vnd.yellowriver-custom-menu
application/x-123 wk
application/x-abiword abw
application/x-apple-diskimage dmg
application/x-bcpio bcpio
application/x-bittorrent torrent
application/x-cdf cdf
application/x-cdlink vcd
application/x-chess-pgn pgn
application/x-core
application/x-cpio cpio
application/x-csh csh
application/x-debian-package deb udeb
application/x-director dcr dir dxr
application/x-dms dms
application/x-doom wad
application/x-dvi dvi
application/x-executable
application/x-flac flac
application/x-font pfa pfb gsf pcf pcf.Z
application/x-freemind mm
application/x-futuresplash spl
application/x-gnumeric gnumeric
application/x-go-sgf sgf
application/x-graphing-calculator gcf
application/x-gtar gtar tgz taz
application/x-hdf hdf
application/x-httpd-php phtml pht php
application/x-httpd-php-source phps
application/x-httpd-php3 php3
application/x-httpd-php3-preprocessed php3p
application/x-httpd-php4 php4
application/x-ica ica
application/x-internet-signup ins isp
application/x-iphone iii
application/x-iso9660-image iso
application/x-java-applet
application/x-java-bean
application/x-java-jnlp-file jnlp
application/x-javascript js
application/x-jmol jmz
application/x-kchart chrt
application/x-kdelnk
application/x-killustrator kil
application/x-koan skp skd skt skm
application/x-kpresenter kpr kpt
application/x-kspread ksp
application/x-kword kwd kwt
application/x-latex latex
application/x-lha lha
application/x-lzh lzh
application/x-lzx lzx
application/x-maker frm maker frame fm fb book fbdoc
application/x-mif mif
application/x-ms-wmd wmd
application/x-ms-wmz wmz
application/x-msdos-program com exe bat dll
application/x-msi msi
application/x-netcdf nc
application/x-ns-proxy-autoconfig pac
application/x-nwc nwc
application/x-object o
application/x-oz-application oza
application/x-pkcs7-certreqresp p7r
application/x-pkcs7-crl crl
application/x-python-code pyc pyo
application/x-quicktimeplayer qtl
application/x-redhat-package-manager rpm
application/x-rx
application/x-sh sh
application/x-shar shar
application/x-shellscript
application/x-shockwave-flash swf swfl
application/x-stuffit sit
application/x-sv4cpio sv4cpio
application/x-sv4crc sv4crc
application/x-tar tar
application/x-tcl tcl
application/x-tex-gf gf
application/x-tex-pk pk
application/x-texinfo texinfo texi
application/x-trash ~ % bak old sik
application/x-troff t tr roff
application/x-troff-man man
application/x-troff-me me
application/x-troff-ms ms
application/x-ustar ustar
application/x-videolan
application/x-wais-source src
application/x-wingz wz
application/x-x509-ca-cert crt
application/x-xcf xcf
application/x-xfig fig
application/x-xpinstall xpi
audio/32kadpcm
audio/basic au snd
audio/g.722.1
audio/l16
audio/midi mid midi kar
audio/mp4a-latm
audio/mpa-robust
audio/mpeg mpga mpega mp2 mp3 m4a
audio/mpegurl m3u
audio/parityfec
audio/prs.sid sid
audio/telephone-event
audio/tone
audio/vnd.cisco.nse
audio/vnd.cns.anp1
audio/vnd.cns.inf1
audio/vnd.digital-winds
audio/vnd.everad.plj
audio/vnd.lucent.voice
audio/vnd.nortel.vbk
audio/vnd.nuera.ecelp4800
audio/vnd.nuera.ecelp7470
audio/vnd.nuera.ecelp9600
audio/vnd.octel.sbc
audio/vnd.qcelp
audio/vnd.rhetorex.32kadpcm
audio/vnd.vmx.cvsd
audio/x-aiff aif aiff aifc
audio/x-gsm gsm
audio/x-mpegurl m3u
audio/x-ms-wma wma
audio/x-ms-wax wax
audio/x-pn-realaudio-plugin
audio/x-pn-realaudio ra rm ram
audio/x-realaudio ra
audio/x-scpls pls
audio/x-sd2 sd2
audio/x-wav wav
chemical/x-alchemy alc
chemical/x-cache cac cache
chemical/x-cache-csf csf
chemical/x-cactvs-binary cbin cascii ctab
chemical/x-cdx cdx
chemical/x-cerius cer
chemical/x-chem3d c3d
chemical/x-chemdraw chm
chemical/x-cif cif
chemical/x-cmdf cmdf
chemical/x-cml cml
chemical/x-compass cpa
chemical/x-crossfire bsd
chemical/x-csml csml csm
chemical/x-ctx ctx
chemical/x-cxf cxf cef
#chemical/x-daylight-smiles smi
chemical/x-embl-dl-nucleotide emb embl
chemical/x-galactic-spc spc
chemical/x-gamess-input inp gam gamin
chemical/x-gaussian-checkpoint fch fchk
chemical/x-gaussian-cube cub
chemical/x-gaussian-input gau gjc gjf
chemical/x-gaussian-log gal
chemical/x-gcg8-sequence gcg
chemical/x-genbank gen
chemical/x-hin hin
chemical/x-isostar istr ist
chemical/x-jcamp-dx jdx dx
chemical/x-kinemage kin
chemical/x-macmolecule mcm
chemical/x-macromodel-input mmd mmod
chemical/x-mdl-molfile mol
chemical/x-mdl-rdfile rd
chemical/x-mdl-rxnfile rxn
chemical/x-mdl-sdfile sd sdf
chemical/x-mdl-tgf tgf
#chemical/x-mif mif
chemical/x-mmcif mcif
chemical/x-mol2 mol2
chemical/x-molconn-Z b
chemical/x-mopac-graph gpt
chemical/x-mopac-input mop mopcrt mpc dat zmt
chemical/x-mopac-out moo
chemical/x-mopac-vib mvb
chemical/x-ncbi-asn1 asn
chemical/x-ncbi-asn1-ascii prt ent
chemical/x-ncbi-asn1-binary val aso
chemical/x-ncbi-asn1-spec asn
chemical/x-pdb pdb ent
chemical/x-rosdal ros
chemical/x-swissprot sw
chemical/x-vamas-iso14976 vms
chemical/x-vmd vmd
chemical/x-xtel xtel
chemical/x-xyz xyz
image/cgm
image/g3fax
image/gif gif
image/ief ief
image/jpeg jpeg jpg jpe
image/naplps
image/pcx pcx
image/png png
image/prs.btif
image/prs.pti
image/svg+xml svg svgz
image/tiff tiff tif
image/vnd.cns.inf2
image/vnd.djvu djvu djv
image/vnd.dwg
image/vnd.dxf
image/vnd.fastbidsheet
image/vnd.fpx
image/vnd.fst
image/vnd.fujixerox.edmics-mmr
image/vnd.fujixerox.edmics-rlc
image/vnd.mix
image/vnd.net-fpx
image/vnd.svf
image/vnd.wap.wbmp wbmp
image/vnd.xiff
image/x-cmu-raster ras
image/x-coreldraw cdr
image/x-coreldrawpattern pat
image/x-coreldrawtemplate cdt
image/x-corelphotopaint cpt
image/x-icon ico
image/x-jg art
image/x-jng jng
image/x-ms-bmp bmp
image/x-photoshop psd
image/x-portable-anymap pnm
image/x-portable-bitmap pbm
image/x-portable-graymap pgm
image/x-portable-pixmap ppm
image/x-rgb rgb
image/x-xbitmap xbm
image/x-xpixmap xpm
image/x-xwindowdump xwd
inode/chardevice
inode/blockdevice
inode/directory-locked
inode/directory
inode/fifo
inode/socket
message/delivery-status
message/disposition-notification
message/external-body
message/http
message/s-http
message/news
message/partial
message/rfc822
model/iges igs iges
model/mesh msh mesh silo
model/vnd.dwf
model/vnd.flatland.3dml
model/vnd.gdl
model/vnd.gs-gdl
model/vnd.gtw
model/vnd.mts
model/vnd.vtu
model/vrml wrl vrml
multipart/alternative
multipart/appledouble
multipart/byteranges
multipart/digest
multipart/encrypted
multipart/form-data
multipart/header-set
multipart/mixed
multipart/parallel
multipart/related
multipart/report
multipart/signed
multipart/voice-message
text/calendar ics icz
text/comma-separated-values csv
text/css css
text/directory
text/english
text/enriched
text/h323 323
text/html html htm shtml
text/iuls uls
text/mathml mml
text/parityfec
text/plain asc txt text diff pot
text/prs.lines.tag
text/rfc822-headers
text/richtext rtx
text/rtf rtf
text/scriptlet sct wsc
text/t140
text/texmacs tm ts
text/tab-separated-values tsv
text/uri-list
text/vnd.abc
text/vnd.curl
text/vnd.DMClientScript
text/vnd.flatland.3dml
text/vnd.fly
text/vnd.fmi.flexstor
text/vnd.in3d.3dml
text/vnd.in3d.spot
text/vnd.IPTC.NewsML
text/vnd.IPTC.NITF
text/vnd.latex-z
text/vnd.motorola.reflex
text/vnd.ms-mediapackage
text/vnd.sun.j2me.app-descriptor jad
text/vnd.wap.si
text/vnd.wap.sl
text/vnd.wap.wml wml
text/vnd.wap.wmlscript wmls
text/x-bibtex bib
text/x-c++hdr h++ hpp hxx hh
text/x-c++src c++ cpp cxx cc
text/x-chdr h
text/x-crontab
text/x-csh csh
text/x-csrc c
text/x-dsrc d
text/x-haskell hs
text/x-java java
text/x-literate-haskell lhs
text/x-makefile
text/x-moc moc
text/x-pascal p pas
text/x-pcs-gcd gcd
text/x-perl pl pm
text/x-python py
text/x-server-parsed-html
text/x-setext etx
text/x-sh sh
text/x-tcl tcl tk
text/x-tex tex ltx sty cls
text/x-vcalendar vcs
text/x-vcard vcf
video/dl dl
video/dv dif dv
video/fli fli
video/gl gl
video/mpeg mpeg mpg mpe
video/mp4 mp4
video/quicktime qt mov
video/mp4v-es
video/parityfec
video/pointer
video/vnd.fvt
video/vnd.motorola.video
video/vnd.motorola.videop
video/vnd.mpegurl mxu
video/vnd.mts
video/vnd.nokia.interleaved-multimedia
video/vnd.vivo
video/x-la-asf lsf lsx
video/x-mng mng
video/x-ms-asf asf asx
video/x-ms-wm wm
video/x-ms-wmv wmv
video/x-ms-wmx wmx
video/x-ms-wvx wvx
video/x-msvideo avi
video/x-sgi-movie movie
x-conference/x-cooltalk ice
x-world/x-vrml vrm vrml wrl
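The table above follows the usual mime.types layout: a MIME type, then zero or more whitespace-separated extensions, with `#` marking a commented-out entry. A minimal sketch of reading that layout is shown below. The helper name `parse_mime_types` and the first-mapping-wins policy are assumptions made for illustration; they are not necessarily what `add_files` does with this file.

```python
def parse_mime_types(text):
    """Return a dict mapping each extension to a MIME type (hypothetical helper)."""
    ext_to_type = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and commented-out entries such as
        # "#chemical/x-mif mif".
        if not line or line.startswith('#'):
            continue
        fields = line.split()
        mime_type, extensions = fields[0], fields[1:]
        for ext in extensions:
            # Assumption of this sketch: the first mapping seen for an
            # extension is kept, later ones are ignored.
            ext_to_type.setdefault(ext, mime_type)
    return ext_to_type


# A few lines taken from the table above.
SAMPLE = """\
application/x-gtar gtar tgz taz
application/x-java-applet
application/x-mif mif
#chemical/x-mif mif
audio/mpegurl m3u
audio/x-mpegurl m3u
"""

table = parse_mime_types(SAMPLE)
```

Types listed without extensions (for example `application/x-java-applet`) contribute nothing to the extension table in this sketch.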
from Products.MimetypesRegistry.interfaces import IClassifier
from Products.MimetypesRegistry.MimeTypeItem import MimeTypeItem
from Products.MimetypesRegistry.common import MimeTypeException
from types import InstanceType
import re
class text_plain(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "Plain Text"
mimetypes = ('text/plain',)
extensions = ('txt',)
binary = 0
class text_pre_plain(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "Pre-formatted Text (<pre>)"
mimetypes = ('text/plain-pre',)
extensions = ()
binary = 0
class text_structured(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "Structured Text"
mimetypes = ('text/structured',)
extensions = ('stx',)
binary = 0
class text_rest(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "reStructured Text"
mimetypes = ("text/x-rst", "text/restructured",)
extensions = ("rst", "rest", "restx") #txt?
binary = 0
class text_python(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "Python Source"
mimetypes = ("text/python-source", "text/x-python",)
extensions = ("py",)
binary = 0
class text_wiki(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "Wiki text"
mimetypes = ("text/wiki",)
extensions = ()
binary = 0
class application_rtf(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = 'Rich Text Format (RTF)'
mimetypes = ('application/rtf',)
extensions = ('rtf',)
binary = 1
class application_msword(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "Microsoft Word Document"
mimetypes = ('application/msword',)
extensions = ('doc',)
binary = 1
class text_xml(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__ + (IClassifier,)
__name__ = "Extensible Markup Language (XML)"
mimetypes = ('text/xml',)
extensions = ('xml',)
binary = 0
def classify(self, data):
m = re.search(r'^\s*<\?xml.*\?>', data)
if m:
return 1 # True
return None # False
class application_octet_stream(MimeTypeItem):
"""we need to be sure this one exists"""
__name__ = "Octet Stream"
mimetypes = ('application/octet-stream',)
binary = 1
extensions = ()
class text_html(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "HTML"
mimetypes = ('text/html',)
extensions = ('html', 'htm')
binary = 0
class text_html_safe(MimeTypeItem):
__implements__ = MimeTypeItem.__implements__
__name__ = "Safe HTML"
mimetypes = ('text/x-html-safe',)
extensions = ()
binary = 0
reg_types = [
text_plain,
text_pre_plain,
application_msword,
text_xml,
text_structured,
text_rest,
text_python,
text_wiki,
application_octet_stream,
application_rtf,
text_html,
text_html_safe,
]
def initialize(registry):
for mt in reg_types:
if type(mt) != InstanceType:
mt = mt()
registry.register(mt)
__all__ = tuple([cls.__name__ for cls in reg_types])
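The `text_xml.classify` method above decides on content alone: data counts as XML when an `<?xml ... ?>` declaration appears at the start, optionally preceded by whitespace. The standalone sketch below uses the same pattern outside the registry. The function name `looks_like_xml` is hypothetical, and the sample strings are the ones used in the product's own tests.

```python
import re

# Same pattern as text_xml.classify, written as a raw string.
XML_DECL = re.compile(r'^\s*<\?xml.*\?>')


def looks_like_xml(data):
    """Return True when data starts with an XML declaration (hypothetical helper)."""
    return XML_DECL.search(data) is not None


real_xml = looks_like_xml("<?xml version='1.0'?><foo>bar</foo>")
padded_xml = looks_like_xml("  <?xml version='1.0'?><foo>bar</foo>")
plain_text = looks_like_xml('xml > plain text')
```

Because the pattern is anchored with `^` and no `re.MULTILINE` flag is given, a declaration that appears later in the data does not count.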
import os.path
from Products.MimetypesRegistry.MimeTypeItem import MimeTypeItem
from Products.MimetypesRegistry.MimeTypeItem import guess_icon_path
from Products.MimetypesRegistry.common import MimeTypeException
try:
from zope.contenttype import add_files
except ImportError: # BBB: Zope < 2.10
try:
from zope.app.content_types import add_files
except ImportError: # BBB: Zope < 2.9
from OFS.content_types import add_files
import mimetypes as pymimetypes
mimes_initialized = False
def mimes_initialize():
global mimes_initialized
if mimes_initialized:
return
mimes_initialized = True
# Augment known mime-types.
here = os.path.dirname(os.path.abspath(__file__))
add_files([os.path.join(here, 'mime.types')])
# Don't register the mimetype from Python's mimetypes module if it
# matches one of these extensions.
skip_extensions = (
)
def initialize(registry):
# Find things that are not in the specially registered mimetypes
# and add them using some default policy; none of these will
# implement IClassifier.
# Read our included mime.types file, in addition to whatever the
# mimetypes python module might have found.
mimes_initialize()
# Initialize from registry known mimetypes if we are on Windows
# and pywin32 is available.
try:
from windows_mimetypes import initialize as win32_initialize
win32_initialize()
except ImportError:
pass
for ext, mt in pymimetypes.types_map.items():
if ext[0] == '.':
ext = ext[1:]
if registry.lookupExtension(ext):
continue
if ext in skip_extensions:
continue
try:
mto = registry.lookup(mt)
except MimeTypeException:
# malformed MIME type
continue
if mto:
mto = mto[0]
if not ext in mto.extensions:
registry.register_extension(ext, mto)
mto.extensions += (ext, )
# guess the icon path again so that the icon can match the new extension
mto.icon_path = guess_icon_path(mto)
continue
isBin = mt.split('/', 1)[0] != "text"
registry.register(MimeTypeItem(mt, (mt,), (ext,), isBin))
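The fallback policy in `initialize` above comes down to three steps: strip the leading dot from each extension, skip extensions the registry already knows, and treat every type outside the `text/` family as binary. The sketch below applies that policy to a plain dict so it can run without Zope. `fallback_entries` and `known_extensions` are hypothetical stand-ins for the registry calls, not part of the product.

```python
def is_binary(mime_type):
    # Same rule as above: only the "text" top-level type is non-binary.
    return mime_type.split('/', 1)[0] != 'text'


def fallback_entries(types_map, known_extensions=()):
    """Return (mime type, extension, binary flag) tuples for unknown extensions."""
    entries = []
    for ext, mt in sorted(types_map.items()):
        if ext.startswith('.'):
            ext = ext[1:]
        # Stand-in for registry.lookupExtension(ext): skip what is known.
        if ext in known_extensions:
            continue
        entries.append((mt, ext, is_binary(mt)))
    return entries


sample = {'.txt': 'text/plain', '.png': 'image/png', '.xml': 'text/xml'}
entries = fallback_entries(sample, known_extensions=('xml',))
```

Passing `mimetypes.types_map` from the standard library instead of `sample` would list everything the Python module knows. Note that the rule is coarse: a textual format registered under `application/`, such as `application/xml`, is flagged as binary.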
import os
from xml.sax import parse
from xml.sax.handler import ContentHandler
DIR = os.path.dirname(__file__)
SMI_NAME = "freedesktop.org.xml"
SMI_FILE = os.path.join(DIR, SMI_NAME)
class SharedMimeInfoHandler(ContentHandler):
current = None
collect_comment = None
def __init__(self):
ContentHandler.__init__(self)
self.mimes = []
def startElement(self, name, attrs):
if name in ('mime-type',):
current = {'type': attrs['type'],
'comments': {},
'globs': [],
'aliases': []}
self.mimes.append(current)
self.current = current
return
if name in ('comment',):
# If no lang, assume 'en'
lang = attrs.get('xml:lang', 'en')
if lang not in ('en',):
# Ignore for now.
return
self.__comment_buffer = []
self.__comment_lang = lang
self.collect_comment = True
return
if name in ('glob',):
globs = self.current['globs']
globs.append(attrs['pattern'])
return
if name in ('alias',):
aliases = self.current['aliases']
aliases.append(attrs['type'])
def endElement(self, name):
if self.collect_comment and name in ('comment',):
self.collect_comment = False
lang = self.__comment_lang
comment = u''.join(self.__comment_buffer)
if not comment:
comment = self.current['type']
self.current['comments'][lang] = comment
def characters(self, contents):
if self.collect_comment:
self.__comment_buffer.append(contents)
def readSMIFile(infofile):
"""Reads a shared mime info XML file
"""
handler = SharedMimeInfoHandler()
parse(infofile, handler)
return handler.mimes
mimetypes = readSMIFile(SMI_FILE)
def initialize(registry):
global mimetypes
from Products.MimetypesRegistry.MimeTypeItem import MimeTypeItem
from Products.MimetypesRegistry.common import MimeTypeException
# Find things that are not in the specially registered mimetypes
# and add them using some default policy; none of these will
# implement IClassifier.
for res in mimetypes:
mt = str(res['type'])
mts = (mt,) + tuple(res['aliases'])
# check the mime type
try:
mto = registry.lookup(mt)
except MimeTypeException:
# malformed MIME type
continue
name = str(res['comments'].get(u'en', mt))
# build a list of globs
globs = []
for glob in res['globs']:
if registry.lookupGlob(glob):
continue
else:
globs.append(glob)
if mto:
mto = mto[0]
for glob in globs:
if not glob in mto.globs:
mto.globs = list(mto.globs) + [glob]
registry.register_glob(glob, mto)
for mt in mts:
if not mt in mto.mimetypes:
mto.mimetypes = list(mto.mimetypes) + [mt]
registry.register_mimetype(mt, mto)
else:
isBin = mt.split('/', 1)[0] != "text"
mti = MimeTypeItem(name, mimetypes=mts,
binary=isBin,
globs=globs)
registry.register(mti)
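The `SharedMimeInfoHandler` above walks the freedesktop.org shared-mime-info XML with SAX and collects, per `mime-type` element, its globs and aliases. A reduced sketch of the same idea, parsing an inline document instead of `freedesktop.org.xml`, is shown below. The class name `GlobCollector` is hypothetical, and the sample XML is made up for illustration: the alias pair matches the one exercised in the tests, but the glob patterns are assumptions.

```python
from xml.sax import parseString
from xml.sax.handler import ContentHandler


class GlobCollector(ContentHandler):
    """Collect type, globs and aliases for each mime-type element."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.mimes = []

    def startElement(self, name, attrs):
        if name == 'mime-type':
            self.mimes.append({'type': attrs['type'],
                               'globs': [],
                               'aliases': []})
        elif name == 'glob':
            self.mimes[-1]['globs'].append(attrs['pattern'])
        elif name == 'alias':
            self.mimes[-1]['aliases'].append(attrs['type'])


# Illustrative input only; the glob patterns are assumptions.
SAMPLE = b"""<?xml version="1.0"?>
<mime-info>
  <mime-type type="application/vnd.wordperfect">
    <alias type="application/wordperfect"/>
    <glob pattern="*.wp"/>
    <glob pattern="*.wpd"/>
  </mime-type>
</mime-info>
"""

handler = GlobCollector()
parseString(SAMPLE, handler)
```

After parsing, `handler.mimes` holds one dict per `mime-type` element, which is the shape the `initialize` function above iterates over (minus the comments).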
from Products.MimetypesRegistry.MimeTypeItem import MimeTypeItem
from Products.MimetypesRegistry.common import MimeTypeException
map = {
# '.extension' : 'mimetype',
'.svg' : 'image/svg+xml', # scalable vector graphics
'.pjpg' : 'image/pjpeg', # progressive JPEG
}
def initialize(registry):
# Find things that are not in the specially registered mimetypes
# and add them using some default policy; none of these will
# implement IClassifier.
for ext, mt in map.items():
if ext[0] == '.':
ext = ext[1:]
if registry.lookupExtension(ext):
continue
try:
mto = registry.lookup(mt)
except MimeTypeException:
# malformed MIME type
continue
if mto:
mto = mto[0]
if not ext in mto.extensions:
registry.register_extension(ext, mto)
mto.extensions += (ext, )
continue
isBin = mt.split('/', 1)[0] != "text"
registry.register(MimeTypeItem(mt, (mt,), (ext,), isBin))
# Utilities for mime-types and the Windows registry.
import _winreg
import win32api
import win32con
import mimetypes
import logging
logger = logging.getLogger('mimetypes.win32')
# "safely" query a value, returning a default when it doesn't exist.
def _RegQueryValue(key, value, default=None):
try:
data, typ = win32api.RegQueryValueEx(key, value)
except win32api.error:
return default
if typ == win32con.REG_EXPAND_SZ:
data = win32api.ExpandEnvironmentStrings(data)
if typ in (win32con.REG_EXPAND_SZ, win32con.REG_SZ):
# Occasionally see trailing \0 chars.
data = data.rstrip('\0')
return data
def get_desc_for_mimetype(mime_type):
try:
hk = win32api.RegOpenKey(win32con.HKEY_CLASSES_ROOT,
r"MIME\Database\Content Type\\" + mime_type)
desc = _RegQueryValue(hk, "")
except win32api.error, details:
logger.info("win32api error fetching description for mime-type %r: %s",
mime_type, details)
desc = None
logger.debug("mime-type %s has description %s", mime_type, desc)
return desc
def get_ext_for_mimetype(mime_type):
try:
hk = win32api.RegOpenKey(win32con.HKEY_CLASSES_ROOT,
r"MIME\Database\Content Type\\" + mime_type)
ext = _RegQueryValue(hk, "Extension")
except win32api.error, details:
logger.info("win32api error fetching extension for mime-type %r: %s",
mime_type, details)
ext = None
logger.debug("mime-type %s has extension %s", mime_type, ext)
return ext
def get_mime_types():
try:
hk = win32api.RegOpenKey(win32con.HKEY_CLASSES_ROOT,
r"MIME\Database\Content Type")
items = win32api.RegEnumKeyEx(hk)
except win32api.error, details:
logger.info("win32api error fetching mimetypes: %s",
details)
items = []
return [i[0] for i in items if i[0]]
def normalize(mt):
# Some mimetypes might have extra ';q=value' params.
return mt.lower().split(';')[0]
def initialize():
if not mimetypes.inited:
mimetypes.init()
for mt in get_mime_types():
ext = get_ext_for_mimetype(mt)
if ext is None:
continue
if ext not in mimetypes.types_map:
mimetypes.add_type(normalize(mt), ext)
if __name__=='__main__':
for mt in get_mime_types():
ext = get_ext_for_mimetype(mt)
desc = get_desc_for_mimetype(mt)
print "%s (%s) - %s" % (mt.lower(), desc, ext)
import code; code.interact(local=locals())
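The registry-independent part of the module above is small: normalize each MIME type, skip entries without an extension, and add only extensions that Python's `mimetypes` module does not already know. The sketch below reproduces that part with a plain list of pairs in place of the Windows registry, so it runs on any platform. `add_missing` and the sample type and extension are hypothetical.

```python
import mimetypes


def normalize(mt):
    # Some mimetypes might have extra ';q=value' params.
    return mt.lower().split(';')[0]


def add_missing(pairs):
    """Add (mime type, dotted extension) pairs unknown to the mimetypes module."""
    if not mimetypes.inited:
        mimetypes.init()
    added = []
    for mt, ext in pairs:
        # Entries without an extension, or with a known one, are skipped.
        if ext is None or ext in mimetypes.types_map:
            continue
        mimetypes.add_type(normalize(mt), ext)
        added.append(ext)
    return added


# Made-up sample data standing in for the registry contents.
added = add_missing([
    ('Application/X-Example-Sketch;q=0.5', '.examplesketchext'),
    ('text/plain', None),
])
```

As in the original `initialize`, the extension check happens before normalization, so an existing mapping is never overwritten.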
##############################################################################
#
# ZopeTestCase
#
# COPY THIS FILE TO YOUR 'tests' DIRECTORY.
#
# This version of framework.py will use the SOFTWARE_HOME
# environment variable to locate Zope and the Testing package.
#
# If the tests are run in an INSTANCE_HOME installation of Zope,
# Products.__path__ and sys.path will be adjusted to include the
# instance's Products and lib/python directories respectively.
#
# If you explicitly set INSTANCE_HOME prior to running the tests,
# auto-detection is disabled and the specified path will be used
# instead.
#
# If the 'tests' directory contains a custom_zodb.py file, INSTANCE_HOME
# will be adjusted to use it.
#
# If you set the ZEO_INSTANCE_HOME environment variable a ZEO setup
# is assumed, and you can attach to a running ZEO server (via the
# instance's custom_zodb.py).
#
##############################################################################
#
# The following code should be at the top of every test module:
#
# import os, sys
# if __name__ == '__main__':
# execfile(os.path.join(sys.path[0], 'framework.py'))
#
# ...and the following at the bottom:
#
# if __name__ == '__main__':
# framework()
#
##############################################################################
__version__ = '0.2.3'
# Save start state
#
__SOFTWARE_HOME = os.environ.get('SOFTWARE_HOME', '')
__INSTANCE_HOME = os.environ.get('INSTANCE_HOME', '')
if __SOFTWARE_HOME.endswith(os.sep):
__SOFTWARE_HOME = os.path.dirname(__SOFTWARE_HOME)
if __INSTANCE_HOME.endswith(os.sep):
__INSTANCE_HOME = os.path.dirname(__INSTANCE_HOME)
# Find and import the Testing package
#
if not sys.modules.has_key('Testing'):
p0 = sys.path[0]
if p0 and __name__ == '__main__':
os.chdir(p0)
p0 = ''
s = __SOFTWARE_HOME
p = d = s and s or os.getcwd()
while d:
if os.path.isdir(os.path.join(p, 'Testing')):
zope_home = os.path.dirname(os.path.dirname(p))
sys.path[:1] = [p0, p, zope_home]
break
p, d = s and ('','') or os.path.split(p)
else:
print 'Unable to locate Testing package.',
print 'You might need to set SOFTWARE_HOME.'
sys.exit(1)
import Testing, unittest
execfile(os.path.join(os.path.dirname(Testing.__file__), 'common.py'))
# Include ZopeTestCase support
#
if 1: # Create a new scope
p = os.path.join(os.path.dirname(Testing.__file__), 'ZopeTestCase')
if not os.path.isdir(p):
print 'Unable to locate ZopeTestCase package.',
print 'You might need to install ZopeTestCase.'
sys.exit(1)
ztc_common = 'ztc_common.py'
ztc_common_global = os.path.join(p, ztc_common)
f = 0
if os.path.exists(ztc_common_global):
execfile(ztc_common_global)
f = 1
if os.path.exists(ztc_common):
execfile(ztc_common)
f = 1
if not f:
print 'Unable to locate %s.' % ztc_common
sys.exit(1)
# Debug
#
print 'SOFTWARE_HOME: %s' % os.environ.get('SOFTWARE_HOME', 'Not set')
print 'INSTANCE_HOME: %s' % os.environ.get('INSTANCE_HOME', 'Not set')
sys.stdout.flush()
This is a test of the *reST* transform
o one
o two
o three
#
# Runs all tests in the current directory
#
# Execute like:
# python runalltests.py
#
# Alternatively use the testrunner:
# python /path/to/Zope/utilities/testrunner.py -qa
#
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
import unittest
TestRunner = unittest.TextTestRunner
suite = unittest.TestSuite()
tests = os.listdir(os.curdir)
tests = [n[:-3] for n in tests if n.startswith('test') and n.endswith('.py')]
for test in tests:
m = __import__(test)
if hasattr(m, 'test_suite'):
suite.addTest(m.test_suite())
if __name__ == '__main__':
TestRunner().run(suite)
#!/bin/bash
#
# example test runner shell script
#
# full path to the python interpreter
export PYTHON="/usr/local/bin/python2.3"
# path to ZOPE_HOME/lib/python
export SOFTWARE_HOME="/opt/zope/releases/Zope-2_7-branch/lib/python"
# path to your instance. Don't set it if you don't have an instance
export INSTANCE_HOME="/opt/zope/instances/plone21/"
${PYTHON} runalltests.py
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase
from Products.MimetypesRegistry.encoding import guess_encoding
class TestGuessEncoding(ATSiteTestCase):
def testUTF8(self):
e = guess_encoding('\xef\xbb\xbf any UTF-8 data')
self.failUnlessEqual(e, 'UTF-8')
e = guess_encoding(' any UTF-8 data \xef\xbb\xbf')
self.failUnlessEqual(e, None)
def testEmacs(self):
e = guess_encoding('# -*- coding: UTF-8 -*-')
self.failUnlessEqual(e, 'UTF-8')
e = guess_encoding('''
### -*- coding: ISO-8859-1 -*-
''')
self.failUnlessEqual(e, 'ISO-8859-1')
e = guess_encoding('''
### -*- coding: ISO-8859-1 -*-
''')
self.failUnlessEqual(e, None)
def testVim(self):
e = guess_encoding('# vim:fileencoding=UTF-8')
self.failUnlessEqual(e, 'UTF-8')
e = guess_encoding('''
### vim:fileencoding=ISO-8859-1
''')
self.failUnlessEqual(e, 'ISO-8859-1')
e = guess_encoding('''
### vim:fileencoding= ISO-8859-1
''')
self.failUnlessEqual(e, None)
def testXML(self):
e = guess_encoding('<?xml?>')
self.failUnlessEqual(e, 'UTF-8')
e = guess_encoding('''<?xml version="1.0" encoding="ISO-8859-1" ?>
''')
self.failUnlessEqual(e, 'ISO-8859-1')
e = guess_encoding('''<?xml version="1.0" encoding="ISO-8859-1"?>
''')
self.failUnlessEqual(e, 'ISO-8859-1')
e = guess_encoding('''<?xml version="1.0" encoding="ISO-8859-1"?><truc encoding="UTF-8">
</truc>
''')
self.failUnlessEqual(e, 'ISO-8859-1')
def testHTML(self):
e = guess_encoding('''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<title>ASPN : Python Cookbook : Auto-detect XML encoding</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta name="robots" content="all" />
<meta name="description" content="ActiveState Open Source Programming tools for Perl Python XML xslt scripting with free trials. Quality development tools for programmers systems administrators database administrators network administrators and webmasters" />
<meta name="keywords" content="ActiveState,Perl,xml,xslt,mozilla,Open Source,Python,Perl for Win32,resources,PerlScript,ActivePerl,Programming,Programmers,Integrated,Development,Environment,SOAP,Linux,Solaris,Web,development,tools,free,software,download,support,Perl Resource Kit,System Administration,Sys Admin,WinNT,SQL,Oracle,Email,XML,Linux,Programming,perl,NT,2000,windows,Unix,Software,Security, Administration,systems,windows,database,database,consulting,support,Microsoft,developer,resource,code,tutorials,IDE,Integrated development environment,developer,resources,tcl,php" />
<link rel="stylesheet" href="/ASPN/aspn.css" />
</head>
<body bgcolor="#FFFFFF" leftmargin="0" topmargin="0" marginwidth="0" marginheight="0">
charset=utf-8
</body>
</html> ''')
self.failUnlessEqual(e, 'iso-8859-1')
def test_broken_percent(self):
e = guess_encoding(
r"""<pre>
&lt;metal:block tal:define="dummy python:
request.RESPONSE.setHeader('Content-Type',
'text/html;;charset=%s' % charset)" /&gt;
&lt;metal:block tal:define="dummy
python:request.RESPONSE.setHeader('Content-Language', lang)"
/
&gt;
</pre>
"""
)
# unable to detect a valid encoding
self.failUnlessEqual(e, None)
def test_suite():
from unittest import TestSuite, makeSuite
suite = TestSuite()
suite.addTest(makeSuite(TestGuessEncoding))
return suite
if __name__ == '__main__':
framework()
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase
from Products.MimetypesRegistry.mime_types.magic import guessMime
from utils import input_file_path
samplefiles = [
('OOoWriter', 'application/vnd.sun.xml.writer'),
('OOoCalc', 'application/vnd.sun.xml.calc'),
('sxw-ooo-trolltech', 'application/vnd.sun.xml.writer'), # file from limi
('simplezip', 'application/zip'),
]
class TestGuessMagic(ATSiteTestCase):
def afterSetUp(self):
ATSiteTestCase.afterSetUp(self)
self.registry = self.portal.mimetypes_registry
def test_guessMime(self):
for filename, expected in samplefiles:
file = open(input_file_path(filename))
data = file.read()
file.close()
# call the function directly
got = guessMime(data)
self.failUnlessEqual(got, expected)
# use the mimetypes_registry tool
got_from_tool = self.registry.classify(data)
self.failUnlessEqual(got_from_tool, expected)
# now truncate the data to the first 8k if it is larger
if len(data) > 8192:
data=data[:8192]
got_cutted = self.registry.classify(data)
self.failUnlessEqual(got_cutted, expected)
def test_suite():
from unittest import TestSuite, makeSuite
suite = TestSuite()
suite.addTest(makeSuite(TestGuessMagic))
return suite
if __name__ == '__main__':
framework()
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase
from Products.MimetypesRegistry.mime_types import text_plain
from Products.MimetypesRegistry.mime_types import text_xml
from Products.MimetypesRegistry.mime_types import application_octet_stream
from utils import input_file_path
class TestMimeTypesclass(ATSiteTestCase):
def afterSetUp(self):
ATSiteTestCase.afterSetUp(self)
self.registry = self.portal.mimetypes_registry
def testClassify(self):
reg = self.registry
c = reg._classifiers()
self.failUnless(c[0].name().startswith("Extensible Markup Language"),
c[0].name())
#Real XML
data = "<?xml version='1.0'?><foo>bar</foo>"
mt = reg.classify(data)
self.failUnless(isinstance(mt, text_xml), str(mt))
# with leading whitespace (http://dev.plone.org/archetypes/ticket/622)
# still valid xml
data = " <?xml version='1.0'?><foo>bar</foo>"
mt = reg.classify(data)
self.failUnless(isinstance(mt, text_xml), str(mt))
# also #622: this is not xml
data = 'xml > plain text'
mt = reg.classify(data)
self.failUnless(str(mt) != 'text/xml')
#Passed in MT
mt = reg.classify(data, mimetype="text/plain")
self.failUnless(isinstance(mt, text_plain), str(mt))
#Passed in filename
mt = reg.classify(data, filename="test.xml")
self.failUnless(isinstance(mt, text_xml), str(mt))
mt = reg.classify(data, filename="test.jpg")
self.failUnlessEqual(str(mt), 'image/jpeg')
# use xml classifier
mt = reg.classify('<?xml ?>')
self.failUnless(isinstance(mt, text_xml), str(mt))
# test no data return default
mt = reg.classify('')
self.failUnless(isinstance(mt, text_plain), str(mt))
reg.defaultMimetype = 'text/xml'
mt = reg.classify('')
self.failUnless(isinstance(mt, text_xml), str(mt))
# test unclassifiable data and no stream flag (filename)
mt = reg.classify('xxx')
self.failUnless(isinstance(mt, text_plain), str(mt))
# test unclassifiable data and file flag
mt = reg.classify('baz', filename='xxx')
self.failUnless(isinstance(mt, application_octet_stream), str(mt))
def testExtension(self):
reg = self.registry
data = "<foo>bar</foo>"
mt = reg.lookupExtension(filename="test.xml")
self.failUnless(isinstance(mt, text_xml), str(mt))
mt = reg.classify(data, filename="test.foo")
self.failUnless(isinstance(mt, application_octet_stream), str(mt))
mt = reg.classify(data, filename="test.tgz")
self.failUnlessEqual(str(mt), 'application/x-tar')
mt = reg.classify(data, filename="test.tar.gz")
self.failUnlessEqual(str(mt), 'application/x-tar')
mt = reg.classify(data, filename="test.pdf.gz")
self.failUnlessEqual(str(mt), 'application/pdf')
def testFDOGlobs(self):
# The mime types here might only match if they match a glob on
# the freedesktop.org registry.
data = ''
reg = self.registry
mt = reg.classify(data, filename="test.anim1")
self.failUnlessEqual(str(mt), 'video/x-anim')
mt = reg.classify(data, filename="test.ini~")
self.failUnlessEqual(str(mt), 'application/x-trash')
mt = reg.classify(data, filename="test.ini%")
self.failUnlessEqual(str(mt), 'application/x-trash')
mt = reg.classify(data, filename="test.ini.bak")
self.failUnlessEqual(str(mt), 'application/x-trash')
mt = reg.classify(data, filename="test.f90")
self.failUnlessEqual(str(mt), 'text/x-fortran')
mt = reg.classify(data, filename="test.f95")
self.failUnlessEqual(str(mt), 'text/x-fortran')
mt = reg.classify(data, filename="makefile")
self.failUnlessEqual(str(mt), 'text/x-makefile')
mt = reg.classify(data, filename="Makefile")
self.failUnlessEqual(str(mt), 'text/x-makefile')
mt = reg.classify(data, filename="makefile.ac")
self.failUnlessEqual(str(mt), 'text/x-makefile')
mt = reg.classify(data, filename="makefile.in")
self.failUnlessEqual(str(mt), 'text/x-makefile')
mt = reg.classify(data, filename="AUTHORS")
self.failUnlessEqual(str(mt), 'text/x-authors')
mt = reg.classify(data, filename="INSTALL")
self.failUnlessEqual(str(mt), 'text/x-install')
def testLookup(self):
reg = self.registry
mt = reg.lookup('text/plain')
self.failUnless(isinstance(mt[0], text_plain), str(mt[0]))
# Test lookup of aliases in SMI database (see smi_mimetypes)
mt1 = reg.lookup('application/vnd.wordperfect')
mt2 = reg.lookup('application/wordperfect')
self.assertEqual(mt1, mt2)
mt = reg.lookup('text/notexistent')
self.failUnlessEqual(mt, ())
def testAdaptMt(self):
data, filename, mt = self.registry('bar', mimetype='text/xml')
# this tests that the data has been adapted and the file position reset to 0
self.failUnlessEqual(data, 'bar')
self.failUnlessEqual(filename, None)
self.failUnless(isinstance(mt, text_xml), str(mt))
def testAdaptFile(self):
file = open(input_file_path("rest1.rst"))
data, filename, mt = self.registry(file)
# this tests that the data has been adapted and the file position reset to 0
self.failUnlessEqual(data, file.read())
file.close()
self.failUnlessEqual(filename, "rest1.rst")
self.assertEqual(str(mt), 'text/x-rst')
def testAdaptData(self):
data, filename, mt = self.registry('<?xml ?>')
# this tests that the data has been adapted and the file position reset to 0
self.failUnlessEqual(data, '<?xml ?>')
self.failUnlessEqual(filename, None)
self.failUnless(isinstance(mt, text_xml), str(mt))
def test_suite():
from unittest import TestSuite, makeSuite
suite = TestSuite()
suite.addTest(makeSuite(TestMimeTypesclass))
return suite
if __name__ == '__main__':
framework()
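The `testFDOGlobs` expectations above depend on filename globs rather than plain extensions: names such as `test.ini~` or `Makefile` carry no usable extension. A minimal sketch of glob-based lookup with the standard `fnmatch` module follows. It is an illustration under stated assumptions, not the registry's own matching code: the `GLOBS` table and the `classify_by_glob` helper are hypothetical, and the patterns are guesses that merely reproduce a few of the tested results.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern table; the real patterns live in freedesktop.org.xml.
GLOBS = [
    ('*~', 'application/x-trash'),
    ('*%', 'application/x-trash'),
    ('*.bak', 'application/x-trash'),
    ('[Mm]akefile', 'text/x-makefile'),
]


def classify_by_glob(filename, globs=GLOBS):
    """Return the MIME type of the first matching glob, or None."""
    for pattern, mime_type in globs:
        # fnmatchcase keeps matching case-sensitive on every platform.
        if fnmatchcase(filename, pattern):
            return mime_type
    return None


backup = classify_by_glob('test.ini~')
makefile = classify_by_glob('Makefile')
unknown = classify_by_glob('test.xml')
```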
import re
import glob
from unittest import TestSuite
from sys import modules
from os.path import join, abspath, dirname, basename
def normalize_html(s):
s = re.sub(r"\s+", " ", s)
s = re.sub(r"(?s)\s+<", "<", s)
s = re.sub(r"(?s)>\s+", ">", s)
s = re.sub(r"\r", "", s)
return s
PREFIX = abspath(dirname(__file__))
def input_file_path(file):
return join(PREFIX, 'input', file)
def output_file_path(file):
return join(PREFIX, 'output', file)
def matching_inputs(pattern):
return [basename(path) for path in glob.glob(join(PREFIX, "input", pattern))]
def load(dotted_name, globals=None):
""" load a python module from its name """
mod = __import__(dotted_name, globals)
components = dotted_name.split('.')
for comp in components[1:]:
mod = getattr(mod, comp)
return mod
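The two helpers above that do not depend on the test directory layout, `normalize_html` and `load`, can be tried on their own. The block below repeats them with their indentation restored and runs them on small inputs. The sample HTML string is made up for illustration.

```python
import os.path
import re


def normalize_html(s):
    # Collapse whitespace runs, then drop whitespace next to tags.
    s = re.sub(r"\s+", " ", s)
    s = re.sub(r"(?s)\s+<", "<", s)
    s = re.sub(r"(?s)>\s+", ">", s)
    s = re.sub(r"\r", "", s)
    return s


def load(dotted_name, globals=None):
    """Load a Python module from its dotted name."""
    # __import__('a.b') returns the top-level package 'a', so walk down.
    mod = __import__(dotted_name, globals)
    components = dotted_name.split('.')
    for comp in components[1:]:
        mod = getattr(mod, comp)
    return mod


normalized = normalize_html("<p>\n  Hello   world\r\n</p>\n <b> x </b>")
path_module = load('os.path')
```

Normalizing both the expected and the actual output this way lets transform tests compare markup without failing on incidental whitespace.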
1.4.0-final
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<form method="POST" action="manage_addMimeType"
tal:attributes="action string:${here/absolute_url}/manage_addMimeType;">
<div class="form-title">
Add a new mime type
</div>
<div tal:define="status python:request.get('portal_status', '')"
tal:condition="status"
class="error"
tal:content="status"
/>
<table width="50%">
<tr>
<td class="form-label">Name</td>
<td class="form-element">
<input name="id"
tal:attributes="value python:request.get('id', '');"/>
</td>
</tr><tr>
<td class="form-label">Icon path</td>
<td class="form-element">
<input name="icon_path"
tal:attributes="value python:request.get('icon_path', '');"/>
</td>
</tr><tr>
<td class="form-label">Binary?
</td>
<td class="form-element">
<input name="binary" type="checkbox"
tal:attributes="value python:request.get('binary', '');"/>
</td>
</tr><tr>
<td class="form-label">Mime-types
</td>
<td class="form-element">
<textarea name="mimetypes:list"
tal:content="python:request.get('mimetypes', '')"/>
</td>
</tr><tr>
<td class="form-label">Extensions
</td>
<td class="form-element">
<textarea name="extensions:list"
tal:content="python:request.get('extensions', '')"/>
</td>
</tr><tr>
<td class="form-label">Globs
</td>
<td class="form-element">
<textarea name="globs:list"
tal:content="python:request.get('globs', '')"/>
</td>
</tr>
</table>
<input type="submit" />
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<form method="POST" action=""
tal:attributes="action string:${here/absolute_url}/manage_editMimeType;"
tal:define="mt python:here.lookup(request.form.get('mt_name'))[0]">
<div class="form-title">
Edit mime type <b tal:content="mt/name"/>.
</div>
<div tal:define="status python:request.get('portal_status', '')"
tal:condition="status"
class="error"
tal:content="status"
/>
<input type="hidden" name="name" tal:attributes="value request/mt_name"/>
<table width="50%">
<tr>
<td>Name</td>
<td>
<input name="new_name"
tal:attributes="value python:request.get('new_name', mt.name());"/>
</td>
</tr><tr>
<td>Icon path</td>
<td>
<input name="icon_path"
tal:attributes="value python:request.get('icon_path', mt.icon_path);"/>
</td>
</tr><tr>
<td>Binary?
</td>
<td>
<input name="binary" type="checkbox"
tal:attributes="checked python:request.get('binary', mt.binary) and 1 or 0;"/>
</td>
</tr><tr>
<td>Mime-types
</td>
<td>
<textarea name="mimetypes"
tal:content="python:request.get('mimetypes', '\n'.join(mt.mimetypes))"/>
</td>
</tr><tr>
<td>Extensions
</td>
<td>
<textarea name="extensions"
tal:content="python:request.get('extensions', '\n'.join(mt.extensions))"/>
</td>
</tr><tr>
<td>Globs
</td>
<td>
<textarea name="globs"
tal:content="python:request.get('globs', '\n'.join(mt.globs))"/>
</td>
</tr>
</table>
<input type="submit"/>
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<tal:block tal:define="mimetypes here/list_mimetypes">
<div class="form-title">
Registered MIME types (<span tal:replace="python:len(mimetypes)"/>).
</div>
<div align="right">
<form method="POST" action="manage_addMimeTypeForm">
<input type="submit" value="Add a new MIME type"/>
</form>
</div>
<div tal:define="status python:request.get('portal_status', '')"
tal:condition="status" class="error"
tal:content="status" />
<form method="POST" action="manage_delObjects"
tal:define="dummy mimetypes/sort">
<table width="90%">
<tr class="form-label">
<th colspan="3">Name</th>
<th>Mime-types</th>
<th>Extensions</th>
<th>Globs</th>
<th>Binary?</th>
</tr>
<tr class="form-element" tal:repeat="mt_id mimetypes">
<tal:block tal:define="mt python:here.lookup(mt_id)[0];">
<td>
<input type="checkbox" name="ids:list"
tal:attributes="value mt/normalized"/>
</td>
<td>
<img tal:attributes="src string:${here/portal_url}/${mt/icon_path}"/>
</td>
<td>
<a tal:content="mt/name"
tal:attributes="href string:${here/absolute_url}/manage_editMimeTypeForm?mt_name=${mt/normalized}"/>
</td>
<td tal:content="python:', '.join(mt.mimetypes)"/>
<td tal:content="python:', '.join(mt.extensions)"/>
<td tal:content="python:', '.join(mt.globs)"/>
<td tal:content="python: mt.binary and 'yes' or 'no'"/>
</tal:block>
</tr>
</table>
<input type="submit" value="Delete Selected Items" />
</form>
</tal:block>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
DONT USE ChangeLog USE HISTORY.txt instead.
2004-07-24 Christian Heimes <heimes@faho.rwth-aachen.de>
* Changed version to stick to Archetypes version.
2004-05-25 Christian Heimes <heimes@faho.rwth-aachen.de>
* Separate MimetypesRegistry into a new product
2004-04-20 Christian Heimes <heimes@faho.rwth-aachen.de>
* transforms/rest.py: rest transform is now using the zope implementation if
available
2004-04-07 Christian Heimes <heimes@faho.rwth-aachen.de>
* transforms/text_pre_to_html.py: new transform for preformatted plain text
* transforms/text_to_html.py: changed <br/> to <br />
2004-03-17 Christian Heimes <heimes@faho.rwth-aachen.de>
* transforms/pdf_to_text.py: return text utf-8 encoded
2004-02-04 Sylvain Thénault <syt@logilab.fr>
* transforms/office_com.py: fix wrong import
2003-12-03 Sidnei da Silva <sidnei@awkly.org>
* mime_types/magic.py (guessMime): Don't try to be so magic :)
2003-11-18 Andreas Jung <andreas@andreas-jung.com>
* commandtransform.py: fixed sick cleanDir() implementation
2003-11-17 Andreas Jung <andreas@andreas-jung.com>
* added rtf_to_html.py converter
* added rtf as a mimetype to mime_types/__init__.py
* added rtf_to_xml.py converter
* added pdf_to_text.py converter
* removed dependency from CMFDefault.utils for misc converters
(integrated code into libtransforms/utils.py)
2003-11-14 Sidnei da Silva <sidnei@plone.org>
* MimeTypesRegistry.py (MimeTypesRegistry.classify): If no results
this far, use magic.py module, written by Jason Petrone, and
updated by Gabriel Wicke with the data from gnome-vfs-mime-magic.
2003-11-07 Sylvain Thénault <syt@logilab.fr>
* use the same license as Archetypes (BSD like instead of GPL)
* www/tr_widgets.zpt: fix bug in the list widget (space before the
parameter's name, making it unrecognized)
* zope/Transform.py: fix set parameters to correctly remap
transform if editable inputs or output. (fix #837244)
* TransformEngine.py: better error messages, a few lines wrapping
* zope/__init__.py: use pt_globals instead of globals for variable
handling the product globals, making it reloadable
* Extensions/Install.py: use pt_globals
* www/listMimeTypes.zpt: use mt/normalized as id instead of mt/name
2003-11-05 Sylvain Thénault <syt@logilab.fr>
* unsafe_transforms/command.py: added dummy output mime type to avoid
error when added via the ZMI (fix #837252)
2003-10-30 Sylvain Thénault <syt@logilab.fr>
* fixed addMimeType, editMimeType and tr_widget templates (fix #832958)
2003-10-03 Sidnei da Silva <sidnei@dreamcatcher.homeunix.org>
* utils.py (TransformException.getToolByName): Modified
getToolByName to have a fallback mimetypes_registry, so we can
simplify BaseUnit.
2003-09-23 Sylvain Thénault <syt@logilab.fr>
* MimeTypesRegistry.py: make unicode error handling configurable
* zope/MimeTypesTool.py: add a property for unicode error handling
* zope/Transform.py: make tests working
2003-08-19 Sylvain Thénault <syt@logilab.fr>
* transforms/rest.py: override "traceback" setting to avoid
sys.exit !
* transforms/text_to_html.py: use html_quote
2003-08-12 Sylvain Thénault <syt@logilab.fr>
* TransformEngine.py: set "encoding" in datastream metadata if
transform provides an "output_encoding" attribute. Fix access to
"id" instead of "name()"
* zope/Transform.py: add some code to handle output encoding...
2003-08-08 Sylvain Thénault <syt@logilab.fr>
* MimeTypesRegistry.py: use a suffix map as the standard mimetypes
module does, hopefully correcting the behaviour of classify
* unsafe_transforms/build_transforms.py: fix inputs and output
mime type of ps_to_text transform
2003-08-07 Sylvain Thenault <sylvain.thenault@logilab.fr>
* encoding.py: new module which aims to detect encoding of text
files
* MimeTypesRegistry.py: use the encoding module in iadapter
2003-08-06 Sylvain Thenault <sylvain.thenault@logilab.fr>
* MimeTypesRegistry.py (classify): return
'application/octet-stream' instead of None
* transforms/text_to_html.py: replace '\n' with <br/> instead of
<pre> wrapping
* unsafe_transforms/build_transforms.py: create a ps_to_text
transform if ps2ascii is available
* tests/test_transforms.py: handle name of transforms to test on
command line
* transforms/__init__.py: do not print traceback on missing binary
exception
2003-08-01 Sylvain Thenault <sylvain.thenault@logilab.fr>
* transforms/text_to_html.py: new transform to wrap plain text in
<pre> for html
* transforms/test_transforms.py: add test for text_to_html
2003-07-28 Sylvain Thenault <sylvain.thenault@logilab.fr>
* zope/TransformsChain.py: fixes to make it work within Zope.
* www/editTransformsChain.zpt: add inputs / output information.
2003-07-28 Sylvain Thenault <sylvain.thenault@logilab.fr>
* transforms/rest.py: remove class="document"
* tests/test_transforms.py: added missing output for the identity
transform's test, fix initialize method.
2003-07-21 Sylvain Thenault <sylvain.thenault@logilab.fr>
* transforms/identity.py: added identity transform (used for instance
to convert text/x-rest to text/plain).
* tests/test_transforms.py: added test for the identity transform.
2003-07-11 Sylvain Thenault <sylvain.thenault@logilab.fr>
* unsafe_transforms/xml.py: just make it working.
* unsafe_transforms/command.py: add missing "name" argument to the
constructor. Use popen3 instead of popen4.
* unsafe_transforms/build_transforms.py: create an xml_to_html
transform if an xslt processor is available (however this transform
is not configured for any doctypes / dtds). Create tidy_html
transform if the tidy command is available.
* tests/test_transforms.py: add test cases for the xml and
html_tidy transform.
* transform.py: added transform_customize hook.
* docs/user_manual.rst: explain difference between python distro
and zope product. Added notice about archetypes integration.
* docs/dev_manual.rst: minor fixes.
2003-07-10 Sylvain Thenault <sylvain.thenault@logilab.fr>
* refactoring to permit use of this package outside zope :)
Zope mode is triggered when "import Zope" doesn't fail
* fix bug in word_to_html / office_wvware transform
* add a generic test for transforms. It's much easier now to
add a test for a transform :)
* add licensing information
* interfaces.py: complete / cleanup interfaces
* bin/transform: add command line tool
* unsafe_transforms/command.py: bug fix
* addTransformsChain.zpt: fix typo
* fix #768927
2003-07-09 Sylvain Thenault <sylvain.thenault@logilab.fr>
* code cleaning:
- moved Transform and TransformsChain in their own files
- removed the no longer used bindingmixin and sourceAdapter
- merged transform and chain classes together
- generic cache and misc utilities in the new utils.py.
* ready for 1.0 alpha1 :)
2003-07-05 Sylvain Thenault <sylvain.thenault@logilab.fr>
* make the PortalTransforms product from the original transform
package and the mimetypes / transforms tools originally defined in
Archetypes.
* drop the ability to use it as a standalone python package, since
there was too much duplicated code to make it work.
* some work on tests to make them succeed :)
* MimeTypesTool.py (MimeTypesTool.lookup): return an empty list
instead of None when no matching mime types is found.
2003-05-14 Sidnei da Silva <sidnei@x3ng.com>
* interface.py: Trying to normalize the way interfaces are
imported in different versions of Zope.
2003-04-21 Sidnei da Silva <sidnei@x3ng.com>
* __init__.py: Fixed lots of things here and there to make it work
with the new BaseUnit in Archetypes.
2003-04-20 Sidnei da Silva <sidnei@x3ng.com>
* tests/output/rest3.out: Fixed subtitle and added a test.
2003-04-19 Sidnei da Silva <sidnei@x3ng.com>
* tests/test_rest.py (BaseReSTFileTest.testSame): Added tests
based on input/output dirs to make it easy to add new tests for reST.
* transforms/rest.py (rest.convert): Rendering of
reST was broken. It was not rendering references the right way,
and it didn't seem like it was doing the right thing with
titles. Updated to use docutils.core.publish_string.
* tests/test_all.py (test_suite): Added lynx_dump to transform
html -> text. With tests.
2003-04-18 Sidnei da Silva <sidnei@x3ng.com>
* tests/test_all.py (test_suite): Removed dependencies from
CMFCore on testsuite.
* __init__.py: Made it work without being inside Products. We
eventually need to make a distutils setup, and then this can be
removed. If someone knows a better way to do this, please do.
lynx
pdftohtml
python-docutils
import os

from Products.CMFCore.DirectoryView import addDirectoryViews
from Products.CMFCore.DirectoryView import registerDirectory
from Products.CMFCore.DirectoryView import createDirectoryView
from Products.CMFCore.DirectoryView import manage_listAvailableDirectories
from Products.CMFCore.utils import getToolByName
from Products.CMFCore.utils import minimalpath
from Globals import package_home
from Acquisition import aq_base
from OFS.ObjectManager import BadRequestException

from Products.PortalTransforms import GLOBALS
from Products.PortalTransforms import skins_dir

from StringIO import StringIO


def install(self):
    out = StringIO()

    qi=getToolByName(self, 'portal_quickinstaller')
    qi.installProduct('MimetypesRegistry',)

    id = 'portal_transforms'
    if hasattr(aq_base(self), id):
        pt = getattr(self, id)
        if not getattr(aq_base(pt), '_new_style_pt', None) == 1:
            print >>out, 'Removing old portal transforms tool'
            self.manage_delObjects([id,])

    if not hasattr(aq_base(self), id):
        addTool = self.manage_addProduct['PortalTransforms'].manage_addTool
        addTool('Portal Transforms')
        print >>out, 'Installing portal transforms tool'

    updateSafeHtml(self, out)
    correctMapping(self, out)

    # not required right now
    # installSkin(self)

    return out.getvalue()


def correctMapping(self, out):
    pt = getToolByName(self, 'portal_transforms')
    pt_ids = pt.objectIds()

    for m_in, m_out_dict in pt._mtmap.items():
        for m_out, transforms in m_out_dict.items():
            for transform in transforms:
                if transform.id not in pt_ids:
                    #error, mapped transform is no object in portal_transforms. correct it!
                    print >>out, "have to unmap transform (%s) cause its not in portal_transforms ..." % transform.id
                    try:
                        pt._unmapTransform(transform)
                    except:
                        raise
                    else:
                        print >>out, "...ok"


def updateSafeHtml(self, out):
    print >>out, 'Update safe_html...'
    safe_html_id = 'safe_html'
    safe_html_module = "Products.PortalTransforms.transforms.safe_html"
    pt = getToolByName(self, 'portal_transforms')
    for id in pt.objectIds():
        transform = getattr(pt, id)
        if transform.id == safe_html_id and transform.module == safe_html_module:
            try:
                disable_transform = transform.get_parameter_value('disable_transform')
            except KeyError:
                print >>out, ' replace safe_html (%s, %s) ...' % (transform.name(), transform.module)
                try:
                    pt.unregisterTransform(id)
                    pt.manage_addTransform(id, safe_html_module)
                except:
                    raise
                else:
                    print >>out, ' ...done'

    print >>out, '...done'


def installSkin(self):
    skinstool=getToolByName(self, 'portal_skins')

    fullProductSkinsPath = os.path.join(package_home(GLOBALS), skins_dir)
    productSkinsPath = minimalpath(fullProductSkinsPath)
    registered_directories = manage_listAvailableDirectories()
    if productSkinsPath not in registered_directories:
        registerDirectory(skins_dir, GLOBALS)
    try:
        addDirectoryViews(skinstool, skins_dir, GLOBALS)
    except BadRequestException, e:
        pass  # directory view has already been added

    files = os.listdir(fullProductSkinsPath)
    for productSkinName in files:
        if os.path.isdir(os.path.join(fullProductSkinsPath, productSkinName)) \
               and productSkinName != 'CVS':
            for skinName in skinstool.getSkinSelections():
                path = skinstool.getSkinPath(skinName)
                path = [i.strip() for i in path.split(',')]
                try:
                    if productSkinName not in path:
                        path.insert(path.index('custom') +1, productSkinName)
                except ValueError:
                    if productSkinName not in path:
                        path.append(productSkinName)
                path = ','.join(path)
                skinstool.addSkinSelection(skinName, path)
1.4.0-final - 2006-06-16
========================
* Shut down a noisy logging message to DEBUG level.
[hannosch]
* Converted logging infrastructure from zLOG usage to Python's logging module.
[hannosch]
* Avoid DeprecationWarning for manageAddDelete.
[hannosch]
* Spring-cleaning of tests infrastructure.
[hannosch]
1.4.0-beta1 - 2006-03-26
========================
* removed odd archetypes 1.3 style version checking
[jensens]
* Removed BBB code for CMFCorePermissions import location.
[hannosch]
* removed deprecation-warning for ToolInit
[jensens]
1.3.9-final02 - 2006-01-15
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.9-RC1 - 2005-12-29
======================
* Fixed [ 1293684 ], unregistered Transforms are not unmapped,
Transformation was deleted from portal_transforms, but remained
active.
http://sourceforge.net/tracker/index.php?func=detail&aid=1293684&group_id=75272&atid=543430
Added a cleanup that unmaps deleted transforms on reinstall
[csenger]
* Replaced the safe_html transformation with a configurable version
with the same functionality. Migration is handled on reinstall.
http://trac.plone.org/plone/ticket/4538
[csenger] [dreamcatcher]
* Removed CoUnInitialize call. According to Mark Hammond: The
right thing to do is call that function, although almost no one
does (including pywin32 itself, which does CoInitialize the main
thread) and I've never heard of a problem caused by this
omission.
[sidnei]
* Fix a long outstanding issue with improper COM thread model
initialization. Initialize COM for multi-threading, ignoring any
errors when someone else has already initialized differently.
https://trac.plone.org/plone/ticket/4712
[sidnei]
* Correct some wrong security settings.
[hannosch]
* Fixed the requirements look-up from the policy
(#1358085)
1.3.8-final02 - 2005-10-11
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.7-final01 - 2005-08-30
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.6-final02 - 2005-08-07
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.6-final - 2005-08-01
========================
* Added q to the list of valid and safe html tags by limi's request.
Wrote test for safe_html parsing.
[hannosch]
* Added ins and del to the list of valid and safe html tags.
[ 1199917 ] XHTML DEL tag is removed during the safe_html conversion
[tiran]
1.3.5-final02 - 2005-07-17
==========================
* changed version to stick to the appropriate Archetypes version.
[yenzenz]
1.3.5-final - 2005-07-06
========================
* pdf_to_html can show images now. Revert it to command transformer and
make it work under windows.
[panjunyong]
* refined command based unsafe transform to make it work with windows.
[panjunyong]
* Disabled office_uno by default because it doesn't support multithread yet
[panjunyong]
* Rewrote office_uno to make it work for the recent PyUNO.
[panjunyong]
1.3.4-final01 - 2005-05-20
==========================
* nothing (I hate to write this. But the odd version checking needs it).
[yenzenz]
1.3.4-rc1 - 2005-03-25
======================
* Better error handling for safe html transformation
[tiran]
1.3.3-final - 2005-03-05
========================
* Updated link to rtf converter to http://freshmeat.net/projects/rtfconverter/
[tiran]
* Small fix for the com office converter. COM could crash if word is
invisible. Also a pop up might appear when quitting word.
[gogo]
* Fixed [ 1053846 ] Charset problem with wvware word_to_html conversion
[flacoste]
* Fixed python and text pre transforms to html quote special characters.
Thx to stain. [ 1091670 ] Python source code does not escape HTML.
[tiran]
* Fixed [ 1121812 ] fix PortalTransforms unregisterTransformation()
unregisterTransformation() fails to remove from the ZODB the persistence
wrapper added to the transformation
[dan_t]
* Fixed [ 1118739 ] popentransform does not work on windows
[duncanb]
* Fixed [ 1122175 ] extra indent syntax error in office_uno.py
[ryuuguu]
* fixed bug with some transformers' temp filename: it tried to use the original filename,
which is encoded in utf8 and may contain characters that are invalid on my Windows server.
Just use filename as: unknown.suffix
[panjunyong]
* STX header level is set to 2 instead of using zope.conf. Limi forced me to
change it.
[tiran]
* fixed bug: word_to_html uses office_com under windows
1.3.2-5 - 2004-10-17
====================
* Fixed [ 1041637 ] RichWidget: STX level should be set to 3 instead 1. The
structured text transform is now using the zope.conf option or has an
optional level parameter in the convert method.
[tiran]
* Added win32api.GetShortPathName to libtransforms/commandtransform
so binaries found in directories which have spaces in their names
will work as expected
[runyaga]
1.3.2-4 - 2004-09-30
====================
* nothing changed
1.3.2-3 - 2004-09-25
====================
* Fixed more unit tests
[tiran]
1.3.2-2 - 2004-09-17
====================
* Fixed [ 1025066 ] Serious persistency bug
[dmaurer]
* Fixed some unit test failures. Some unit tests did fail because the reST
and STX output has changed slightly.
[tiran]
* Don't include the first three lines of the lynx output which are url,
title and a blank line. This fixed also a unit test because the url
which was a file in the fs did change every time.
[tiran]
* Fixed a bug in make_unpersistent. It seemed that this method touched values
inside the mapping.
[dreamcatcher]
1.3.2-1 - 2004-09-04
====================
* Disabled filters that were introduced in 1.3.1-1. The currently used
transform path algo is broken and took too long to find a path.
[tiran]
* Cleaned up major parts of PT by removing the python only implementation which
was broken anyway
* Fixed [ 1019632 ] current svn bundle (rev 2942) broken
1.3.1-1 - 2004-08-16
====================
* Introduce the concept of filters (one-hop transforms where the source and
destination are the same mimetype).
[dreamcatcher]
* Add a html filter to extract the content of the body tag (so we don't get a
double <body> when uploading full html files).
[dreamcatcher]
* Change base class for Transform to SimpleItem which is equivalent to the
previous base classes and provides a nice __repr__.
[dreamcatcher]
* Lower log levels.
[dreamcatcher]
* cache.py: Added purgeCache, fixed has cache test.
[tiran]
* Fixed non critical typo in error message: Unvalid -> Invalid
[tiran]
1.3.0-3 - 2004-08-06
====================
* Added context to the convert, convertTo and __call__ methods. The context is
the object on which the transform was called.
[tiran]
* Added isCacheable flag and setCacheable to idatastream (data.py). Now you can
disable the caching of the result of a transformation.
[tiran]
* Added __setstate__ to load new transformations from the file system.
[tiran]
* Fixed [ 1002014 ] Add policy screen doesn't accept single entry
[tiran]
1.3.0-2 - 2004-07-29
====================
* Added workaround for [ 997998 ] PT breaks ZMI/Find [tiran]
Copyright (c) 2002-2003, Benjamin Saller <bcsaller@ideasuite.com>, and
the respective authors.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Archetypes nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
include ChangeLog
include README
include TODO
include version.txt
include bin/transform
include bin/transform.bat
recursive-include docs *.txt
recursive-include docs *.rst
recursive-include docs *.html
recursive-include tests/input *
recursive-include tests/output *
NAME=PortalTransforms
MAJOR_VERSION=1.0
MINOR_VERSION=4
RELEASE_TAG=
PACKAGE_NAME=${NAME}-${MAJOR_VERSION}.${MINOR_VERSION}${RELEASE_TAG}
PYTHON="/usr/bin/python"
TMPDIR=~/tmp
CURDIR=~/src/archetypes/head/PortalTransforms
BASE_DIR=${CURDIR}/..
SOFTWARE_HOME=~/src/zope/2_7/lib/python
INSTANCE_HOME=~/src/instance/shellex
PACKAGES=PortalTransforms
RM=rm -f
RMRF=rm -rf
FIND=find
XARGS=xargs
CD=cd
LN=ln -sfn
CP=cp
TAR=tar
MKDIR=mkdir -p
.PHONY : clean test reindent reindent_clean sdist
.PHONY : default
# default: The default step (invoked when make is called without a target)
default: clean test
clean :
find . \( -name '*~' -o -name '*.py[co]' -o -name '*.bak' \) -exec rm {} \; -print
reindent :
~/src/reindent.py -r -v .
test :
export INSTANCE_HOME=${INSTANCE_HOME}; export SOFTWARE_HOME=${SOFTWARE_HOME}; \
cd ${CURDIR}/tests && ${PYTHON} runalltests.py
# sdist: Create a source distribution file (implies clean).
#
sdist: reindent clean sdist_tgz
# sdist_tgz: Create a tgz archive file as a source distribution.
#
sdist_tgz:
echo -n "${MAJOR_VERSION}.${MINOR_VERSION}${RELEASE_TAG}" >\
${CURDIR}/version.txt
${MKDIR} ${TMPDIR}/${PACKAGE_NAME}
${CD} ${TMPDIR}/${PACKAGE_NAME} && \
for package in ${PACKAGES}; do ${LN} ${BASE_DIR}/$$package .; done && \
${CD} ${TMPDIR} && ${TAR} czfh ${BASE_DIR}/${PACKAGE_NAME}.tgz ${PACKAGE_NAME} \
--exclude=${PACKAGE_NAME}.tgz\
--exclude=CVS \
--exclude=.cvsignore \
--exclude=makefile \
--exclude=Makefile \
--exclude=*.pyc \
--exclude=TAGS \
--exclude=*~ \
--exclude=.#*
${RMRF} ${TMPDIR}/${PACKAGE_NAME}
Portal Transforms
=================
This Zope product provides a new tool for the CMF to perform MIME-type-based
transformations on portal content, along with an easy way to plug in new
transformations for previously unsupported content types. The
mimetypes_registry tool it depends on is provided by the separate
MimetypesRegistry product. The provided tool is:

* portal_transforms (the transform tool): handles transformation of data from
  one MIME type to another
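
To make the idea of MIME-type-based dispatch concrete, here is a minimal sketch in plain Python 3. It is not the portal_transforms API: ``ToyTransformTool``, ``register``, ``convert_to`` and ``text_to_html`` are hypothetical names invented for this illustration, and the real tool needs a running Zope/CMF site.

```python
# Hypothetical, simplified stand-in for the transform tool; not the real API.
import html


class ToyTransformTool:
    """Maps (input MIME type, output MIME type) pairs to converter callables."""

    def __init__(self):
        self._map = {}

    def register(self, inputs, output, func):
        # One transform may accept several input MIME types.
        for mime_in in inputs:
            self._map[(mime_in, output)] = func

    def convert_to(self, target, data, mimetype):
        # Look up a direct, one-hop transform; no chaining is attempted here.
        try:
            func = self._map[(mimetype, target)]
        except KeyError:
            raise LookupError("no transform from %s to %s" % (mimetype, target))
        return func(data)


def text_to_html(text):
    # Escape markup characters, then turn newlines into <br /> tags.
    return html.escape(text).replace("\n", "<br />")


tool = ToyTransformTool()
tool.register(("text/plain",), "text/html", text_to_html)
result = tool.convert_to("text/html", "a < b\nnext line", "text/plain")
print(result)
```

Keying the lookup on the pair of source and target MIME types keeps callers unaware of which transformation actually does the work.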
A bunch of ready to use transformations are also provided. Look at the
documentation for more information.
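
As a rough guide to what a transformation looks like: the Zope wrapper in this product expects a transform module to expose a ``register()`` function that returns an object with ``inputs`` and ``output`` attributes. The sketch below mimics that shape in plain Python 3 without any Zope imports. The class ``UpperCase`` and the simplified ``convert`` signature are assumptions made for illustration; the actual interface is the one defined in interfaces.py.

```python
# Illustrative only: plain Python 3, no Zope imports, simplified convert().
class UpperCase:
    """Toy transform exposing the attributes the Zope wrapper checks for."""

    inputs = ("text/plain",)   # MIME types accepted as input
    output = "text/plain"      # MIME type produced

    def name(self):
        return "upper_case"

    def convert(self, data, **kwargs):
        # The real interface passes richer arguments; a plain string
        # keeps this sketch short.
        return data.upper()


def register():
    # The wrapper imports the module by name and calls register()
    # to obtain the transform instance.
    return UpperCase()


transform = register()
for attr in ("inputs", "output"):
    assert hasattr(transform, attr), "missing required %r attribute" % attr
converted = transform.convert("hello")
print(transform.name(), converted)
```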
Notice this package can also be used as a standalone Python package. If
you've downloaded the Python distribution, you can't make it a Zope
product since Zope files have been removed from this distribution.
This product is an off-spring of the Archetypes project.
Installation
------------
WARNING: The two installation methods may conflict; choose the one that suits
your needs.
Zope
````
* Put this package in your Zope's Products directory and restart Zope
* either use the QuickInstaller to add this product to your CMF site or add an
external method to the root of your CMF site with the following information :
:module: PortalTransforms.Install
:method: install
and click the test tab to run it.
Python
``````
* Extract the tarball
* Run "python setup.py install". See "python setup.py install --help" for
installation options.
* That's it, you should have the library and the *transform* command line tool
installed.
Documentation
-------------
See the *docs* directory in this package.
Mailing-list
------------
Discussion about this product takes place on the Archetypes mailing list:
http://sourceforge.net/mail/?group_id=75272
or on the #plone channel of irc.freenode.net.
Authors
-------
Benjamin Saller <bcsaller@yahoo.com>
Sidnei da Silva <sidnei@x3ng.com>
Sylvain Thénault <sylvain.thenault@logilab.fr>
wv
xsltproc
tidy
unrtf
ppthtml
xlhtml
gs-common
TODO list for the Portal Transforms product
-------------------------------------------
* enhance unsafe_transforms/build_transforms to provide a bunch of
transformations using command/xml with various configuration
* iencoding_classifier ?
* make more transforms :)
from logging import ERROR
from UserDict import UserDict
from Products.PageTemplates.PageTemplateFile import PageTemplateFile
from Globals import InitializeClass
from Globals import PersistentMapping
try:
from ZODB.PersistentList import PersistentList
except ImportError:
from persistent.list import PersistentList
from OFS.SimpleItem import SimpleItem
from AccessControl import ClassSecurityInfo
from Products.CMFCore.permissions import ManagePortal
from Products.CMFCore.utils import getToolByName
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.utils import TransformException, log, _www
from Products.PortalTransforms.transforms.broken import BrokenTransform
__revision__ = '$Id: Transform.py 6255 2006-04-11 15:29:29Z hannosch $'
def import_from_name(module_name):
""" import and return a module by its name """
__traceback_info__ = (module_name, )
m = __import__(module_name)
try:
for sub in module_name.split('.')[1:]:
m = getattr(m, sub)
except AttributeError, e:
raise ImportError(str(e))
return m
def make_config_persistent(kwargs):
""" iterate over the given dictionary and replace each list by a persistent
list and each dictionary by a persistent mapping.
"""
for key, value in kwargs.items():
if type(value) == type({}):
p_value = PersistentMapping(value)
kwargs[key] = p_value
elif type(value) in (type(()), type([])):
p_value = PersistentList(value)
kwargs[key] = p_value
def make_config_nonpersistent(kwargs):
""" iterate over the given dictionary and replace each PersistentList by a
python list and each PersistentMapping by a python dict
"""
for key, value in kwargs.items():
if isinstance(value, PersistentMapping):
p_value = dict(value)
kwargs[key] = p_value
elif isinstance(value, PersistentList):
p_value = list(value)
kwargs[key] = p_value
VALIDATORS = {
'int' : int,
'string' : str,
'list' : PersistentList,
'dict' : PersistentMapping,
}
class Transform(SimpleItem):
"""A transform is an external method with
additional configuration information
"""
__implements__ = itransform
meta_type = 'Transform'
meta_types = all_meta_types = ()
manage_options = (
({'label':'Configure',
'action':'manage_main'},
{'label':'Reload',
'action':'manage_reloadTransform'},) +
SimpleItem.manage_options
)
manage_main = PageTemplateFile('configureTransform', _www)
manage_reloadTransform = PageTemplateFile('reloadTransform', _www)
tr_widgets = PageTemplateFile('tr_widgets', _www)
security = ClassSecurityInfo()
__allow_access_to_unprotected_subobjects__ = 1
def __init__(self, id, module, transform=None):
self.id = id
self.module = module
# DM 2004-09-09: 'Transform' instances are stored as
# part of a module level configuration structure
# Therefore, they must not contain persistent objects
self._config = UserDict()
self._config.__allow_access_to_unprotected_subobjects__ = 1
self._config_metadata = UserDict()
self._tr_init(1, transform)
def __setstate__(self, state):
""" __setstate__ is called whenever the instance is loaded
from the ZODB, like when Zope is restarted.
We should reload the wrapped transform at this time
"""
Transform.inheritedAttribute('__setstate__')(self, state)
self._tr_init()
def _tr_init(self, set_conf=0, transform=None):
""" initialize the zope transform by loading the wrapped transform """
__traceback_info__ = (self.module, )
if transform is None:
transform = self._load_transform()
else:
self._v_transform = transform
# check this is a valid transform
if not hasattr(transform, '__class__'):
raise TransformException('Invalid transform : transform is not a class')
if not itransform.isImplementedBy(transform):
raise TransformException('Invalid transform : itransform is not implemented by %s' % transform.__class__)
if not hasattr(transform, 'inputs'):
raise TransformException('Invalid transform : missing required "inputs" attribute')
if not hasattr(transform, 'output'):
raise TransformException('Invalid transform : missing required "output" attribute')
# manage configuration
if set_conf and hasattr(transform, 'config'):
conf = dict(transform.config)
self._config.update(conf)
make_config_persistent(self._config)
if hasattr(transform, 'config_metadata'):
conf = dict(transform.config_metadata)
self._config_metadata.update(conf)
make_config_persistent(self._config_metadata)
transform.config = dict(self._config)
make_config_nonpersistent(transform.config)
transform.config_metadata = dict(self._config_metadata)
make_config_nonpersistent(transform.config_metadata)
self.inputs = transform.inputs
self.output = transform.output
self.output_encoding = getattr(transform, 'output_encoding', None)
return transform
def _load_transform(self):
m = import_from_name(self.module)
if not hasattr(m, 'register'):
msg = 'Invalid transform module %s: no register function defined' % self.module
raise TransformException(msg)
try:
transform = m.register()
except Exception, err:
transform = BrokenTransform(self.id, self.module, err)
msg = "Cannot register transform %s, using BrokenTransform: Error\n %s" % (self.id, err)
self.title = 'BROKEN'
log(msg, severity=ERROR)
else:
self.title = ''
self._v_transform = transform
return transform
security.declarePrivate('manage_beforeDelete')
def manage_beforeDelete(self, item, container):
SimpleItem.manage_beforeDelete(self, item, container)
if self is item:
# unregister self from catalog on deletion
tr_tool = getToolByName(self, 'portal_transforms')
tr_tool._unmapTransform(self)
security.declarePublic('get_documentation')
def get_documentation(self):
""" return transform documentation """
if not hasattr(self, '_v_transform'):
self._load_transform()
return self._v_transform.__doc__
security.declarePublic('convert')
def convert(self, *args, **kwargs):
""" apply the transform and return the result """
if not hasattr(self, '_v_transform'):
self._load_transform()
return self._v_transform.convert(*args, **kwargs)
security.declarePublic('name')
def name(self):
"""return the name of the transform instance"""
return self.id
security.declareProtected(ManagePortal, 'get_parameters')
def get_parameters(self):
""" get transform's parameters names """
if not hasattr(self, '_v_transform'):
self._load_transform()
keys = self._v_transform.config.keys()
keys.sort()
return keys
security.declareProtected(ManagePortal, 'get_parameter_value')
def get_parameter_value(self, key):
""" get value of a transform's parameter """
value = self._config[key]
type = self.get_parameter_infos(key)[0]
if type == 'dict':
result = {}
for key, val in value.items():
result[key] = val
elif type == 'list':
result = list(value)
else:
result = value
return result
security.declareProtected(ManagePortal, 'get_parameter_infos')
def get_parameter_infos(self, key):
""" get information about a parameter
return a tuple (type, label, description [, type specific data])
where type in (string, int, list, dict)
label and description are two strings describing the field
there may be some additional elements specific to the type :
(key label, value label) for the dict type
"""
try:
return tuple(self._config_metadata[key])
except KeyError:
return 'string', '', ''
security.declareProtected(ManagePortal, 'set_parameters')
def set_parameters(self, REQUEST=None, **kwargs):
""" set transform's parameters """
if not kwargs:
kwargs = REQUEST.form
self.preprocess_param(kwargs)
for param, value in kwargs.items():
try:
self.get_parameter_value(param)
except KeyError:
log('Warning: ignored parameter %r' % param)
continue
meta = self.get_parameter_infos(param)
self._config[param] = VALIDATORS[meta[0]](value)
tr_tool = getToolByName(self, 'portal_transforms')
# need to remap transform if necessary (i.e. configurable inputs / output)
if kwargs.has_key('inputs') or kwargs.has_key('output'):
tr_tool._unmapTransform(self)
if not hasattr(self, '_v_transform'):
self._load_transform()
self.inputs = kwargs.get('inputs', self._v_transform.inputs)
self.output = kwargs.get('output', self._v_transform.output)
tr_tool._mapTransform(self)
# track output encoding
if kwargs.has_key('output_encoding'):
self.output_encoding = kwargs['output_encoding']
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(tr_tool.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'reload')
def reload(self):
""" reload the module where the transformation class is defined """
log('Reloading transform %s' % self.module)
m = import_from_name(self.module)
reload(m)
self._tr_init()
def preprocess_param(self, kwargs):
""" preprocess param fetched from an http post to handle optional dictionary
"""
for param in self.get_parameters():
if self.get_parameter_infos(param)[0] == 'dict':
try:
keys = kwargs[param + '_key']
del kwargs[param + '_key']
except KeyError:
keys = ()
try:
values = kwargs[param + '_value']
del kwargs[param + '_value']
except KeyError:
values = ()
kwargs[param] = dict = {}
for key, value in zip(keys, values):
key = key.strip()
if key:
value = value.strip()
if value:
dict[key] = value
InitializeClass(Transform)
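The `preprocess_param` method above rebuilds a dict parameter from two parallel form fields, `<param>_key` and `<param>_value`. A minimal standalone sketch of that pairing follows, assuming a plain dict in place of the Zope request form. The helper name `pair_form_fields` and the sample form values are hypothetical and not part of the product.

```python
def pair_form_fields(form, param):
    """Pop '<param>_key' / '<param>_value' from form and pair them up.

    Hypothetical helper: mirrors the loop body of preprocess_param for a
    single parameter, without any Zope machinery.
    """
    keys = form.pop(param + '_key', ())
    values = form.pop(param + '_value', ())
    paired = {}
    for key, value in zip(keys, values):
        key = key.strip()
        value = value.strip()
        # Blank keys or blank values are dropped, as in the original loop.
        if key and value:
            paired[key] = value
    form[param] = paired
    return form


# Illustrative form data only.
form = {
    'inputs_key': ['text/plain ', '', 'text/html'],
    'inputs_value': ['utf-8', 'ignored', '  '],
}
pair_form_fields(form, 'inputs')
print(form)
```

Only the first pair survives here: the second has an empty key and the third an empty value after stripping.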
from AccessControl.Role import RoleManager
from AccessControl import ClassSecurityInfo
from Acquisition import Implicit
from Acquisition import aq_parent
from Acquisition import aq_base
from Globals import Persistent
from Globals import InitializeClass
from Globals import PersistentMapping
try:
from ZODB.PersistentList import PersistentList
except ImportError:
from persistent.list import PersistentList
from OFS.Folder import Folder
from OFS.SimpleItem import Item
from Products.PageTemplates.PageTemplateFile import PageTemplateFile
from Products.CMFCore.ActionProviderBase import ActionProviderBase
from Products.CMFCore.permissions import ManagePortal, View
from Products.CMFCore.utils import UniqueObject
from Products.CMFCore.utils import getToolByName
from Products.PortalTransforms.libtransforms.utils import MissingBinary
from Products.PortalTransforms import transforms
from Products.PortalTransforms.interfaces import iengine
from Products.PortalTransforms.interfaces import idatastream
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.data import datastream
from Products.PortalTransforms.chain import TransformsChain
from Products.PortalTransforms.chain import chain
from Products.PortalTransforms.cache import Cache
from Products.PortalTransforms.Transform import Transform
from Products.PortalTransforms.utils import log
from Products.PortalTransforms.utils import TransformException
from Products.PortalTransforms.utils import BadRequest
from Products.PortalTransforms.utils import _www
__revision__ = '$Id: TransformEngine.py 6255 2006-04-11 15:29:29Z hannosch $'
from logging import DEBUG
class TransformTool(UniqueObject, ActionProviderBase, Folder):
id = 'portal_transforms'
meta_type = id.title().replace('_', ' ')
isPrincipiaFolderish = 1 # Show up in the ZMI
__implements__ = iengine
meta_types = all_meta_types = (
{ 'name' : 'Transform',
'action' : 'manage_addTransformForm'},
{ 'name' : 'TransformsChain',
'action' : 'manage_addTransformsChainForm'},
)
manage_addTransformForm = PageTemplateFile('addTransform', _www)
manage_addTransformsChainForm = PageTemplateFile('addTransformsChain', _www)
manage_cacheForm = PageTemplateFile('setCacheTime', _www)
manage_editTransformationPolicyForm = PageTemplateFile('editTransformationPolicy', _www)
manage_reloadAllTransforms = PageTemplateFile('reloadAllTransforms', _www)
manage_options = ((Folder.manage_options[0],) + Folder.manage_options[2:] +
(
{ 'label' : 'Caches',
'action' : 'manage_cacheForm'},
{ 'label' : 'Policy',
'action' : 'manage_editTransformationPolicyForm'},
{ 'label' : 'Reload transforms',
'action' : 'manage_reloadAllTransforms'},
)
)
security = ClassSecurityInfo()
def __init__(self, policies=None, max_sec_in_cache=3600):
self._mtmap = PersistentMapping()
self._policies = policies or PersistentMapping()
self.max_sec_in_cache = max_sec_in_cache
self._new_style_pt = 1
# mimetype oriented conversions (iengine interface) ########################
def unregisterTransform(self, name):
""" unregister a transform
name is the name of a registered transform
"""
self._unmapTransform(getattr(self, name))
if name in self.objectIds():
self._delObject(name)
def convertTo(self, target_mimetype, orig, data=None, object=None,
usedby=None, context=None, **kwargs):
"""Convert orig to a given mimetype
* orig is an encoded string
* data is an optional idatastream object. If None, a new datastream will be
created and returned
* optional object argument is the object to which the data is bound.
If present, the engine will use that object to store cached data.
* additional arguments (kwargs) will be passed to the transformations.
Some usual arguments are : filename, mimetype, encoding
return an object implementing idatastream or None if no path has been
found.
"""
target_mimetype = str(target_mimetype)
if object is not None:
cache = Cache(object)
data = cache.getCache(target_mimetype)
if data is not None:
time, data = data
if self.max_sec_in_cache == 0 or time < self.max_sec_in_cache:
return data
if data is None:
data = self._wrap(target_mimetype)
registry = getToolByName(self, 'mimetypes_registry')
if not getattr(aq_base(registry), 'classify', None):
# avoid problems when importing a site with an old mimetype registry
# XXX return None or orig?
return None
orig_mt = registry.classify(orig,
mimetype=kwargs.get('mimetype'),
filename=kwargs.get('filename'))
orig_mt = str(orig_mt)
if not orig_mt:
log('Unable to guess input mime type (filename=%s, mimetype=%s)' %(
kwargs.get('filename'), kwargs.get('mimetype')), severity=DEBUG)
return None
target_mt = registry.lookup(target_mimetype)
if target_mt:
target_mt = target_mt[0]
else:
log('Unable to match target mime type %s'% str(target_mimetype),
severity=DEBUG)
return None
## fastpath
# If orig_mt and target_mt are the same, we only allow
# a one-hop transform, a.k.a. filter.
# XXX disabled filtering for now
filter_only = False
if orig_mt == str(target_mt):
filter_only = True
data.setData(orig)
md = data.getMetadata()
md['mimetype'] = str(orig_mt)
if object is not None:
cache.setCache(str(target_mimetype), data)
return data
## get a path to output mime type
requirements = self._policies.get(str(target_mt), [])
path = self._findPath(orig_mt, target_mt, list(requirements))
if not path and requirements:
log('Unable to satisfy requirements %s' % ', '.join(requirements),
severity=DEBUG)
path = self._findPath(orig_mt, target_mt)
if not path:
log('NO PATH FROM %s TO %s : %s' % (orig_mt, target_mimetype, path),
severity=DEBUG)
return None #XXX raise TransformError
if len(path) > 1:
## create a chain on the fly (sly)
transform = chain()
for t in path:
transform.registerTransform(t)
else:
transform = path[0]
result = transform.convert(orig, data, context=context, usedby=usedby, **kwargs)
assert idatastream.isImplementedBy(result), \
'result doesn\'t implement idatastream'
self._setMetaData(result, transform)
# set cache if possible
if object is not None and result.isCacheable():
cache.setCache(str(target_mimetype), result)
# return idatastream object
return result
security.declarePublic('convertToData')
def convertToData(self, target_mimetype, orig, data=None, object=None,
usedby=None, context=None, **kwargs):
"""Convert to a given mimetype and return the raw data
ignoring subobjects. see convertTo for more information
"""
data =self.convertTo(target_mimetype, orig, data, object, usedby,
context, **kwargs)
if data:
return data.getData()
return None
security.declarePublic('convert')
def convert(self, name, orig, data=None, context=None, **kwargs):
"""run a transform of a given name on data
* name is the name of a registered transform
see convertTo docstring for more info
"""
if not data:
data = self._wrap(name)
try:
transform = getattr(self, name)
except AttributeError:
raise Exception('No such transform "%s"' % name)
data = transform.convert(orig, data, context=context, **kwargs)
self._setMetaData(data, transform)
return data
def __call__(self, name, orig, data=None, context=None, **kwargs):
"""run a transform by its name, returning the raw data product
* name is the name of a registered transform.
return an encoded string.
see convert docstring for more info on additional arguments.
"""
data = self.convert(name, orig, data, context, **kwargs)
return data.getData()
# utilities ###############################################################
def _setMetaData(self, datastream, transform):
"""set metadata on datastream according to the given transform
(mime type and optionally encoding)
"""
md = datastream.getMetadata()
if hasattr(transform, 'output_encoding'):
md['encoding'] = transform.output_encoding
md['mimetype'] = transform.output
def _wrap(self, name):
"""create a new, empty datastream with the given name"""
return datastream(name)
def _unwrap(self, data):
"""unwrap data from a datastream"""
if idatastream.isImplementedBy(data):
data = data.getData()
return data
def _mapTransform(self, transform):
"""map transform to internal structures"""
registry = getToolByName(self, 'mimetypes_registry')
inputs = getattr(transform, 'inputs', None)
if not inputs:
raise TransformException('Bad transform %s : no input MIME type' %
(transform))
for i in inputs:
mts = registry.lookup(i)
if not mts:
msg = 'Input MIME type %r for transform %s is not registered '\
'in the MIME types registry' % (i, transform.name())
raise TransformException(msg)
for mti in mts:
for mt in mti.mimetypes:
mt_in = self._mtmap.setdefault(mt, PersistentMapping())
output = getattr(transform, 'output', None)
if not output:
msg = 'Bad transform %s : no output MIME type'
raise TransformException(msg % transform.name())
mto = registry.lookup(output)
if not mto:
msg = 'Output MIME type %r for transform %s is not '\
'registered in the MIME types registry' % \
(output, transform.name())
raise TransformException(msg)
if len(mto) > 1:
msg = 'Wildcarding not allowed in transform\'s output '\
'MIME type'
raise TransformException(msg)
for mt2 in mto[0].mimetypes:
try:
if not transform in mt_in[mt2]:
mt_in[mt2].append(transform)
except KeyError:
mt_in[mt2] = PersistentList([transform])
def _unmapTransform(self, transform):
"""unmap transform from internal structures"""
registry = getToolByName(self, 'mimetypes_registry')
for i in transform.inputs:
for mti in registry.lookup(i):
for mt in mti.mimetypes:
mt_in = self._mtmap.get(mt, {})
output = transform.output
mto = registry.lookup(output)
for mt2 in mto[0].mimetypes:
l = mt_in[mt2]
for i in range(len(l)):
if transform.name() == l[i].name():
l.pop(i)
break
else:
log('Can\'t find transform %s from %s to %s' % (
transform.name(), mti, mt),
severity=DEBUG)
def _findPath(self, orig, target, required_transforms=()):
"""return the shortest path for transformation from orig mimetype to
target mimetype
"""
path = []
if not self._mtmap:
return None
# naive algorithm :
# find all possible paths with required transforms
# take the shortest
#
# it should be enough since we should not have so much possible paths
shortest, winner = 9999, None
for path in self._getPaths(str(orig), str(target), required_transforms):
if len(path) < shortest:
winner = path
shortest = len(path)
return winner
def _getPaths(self, orig, target, requirements, path=None, result=None):
"""return all paths for transformation from orig mimetype to
target mimetype
"""
if path is None:
result = []
path = []
requirements = list(requirements)
outputs = self._mtmap.get(orig)
if outputs is None:
return result
path.append(None)
for o_mt, transforms in outputs.items():
for transform in transforms:
required = 0
name = transform.name()
if name in requirements:
requirements.remove(name)
required = 1
if transform in path:
# avoid infinite loop...
continue
path[-1] = transform
if o_mt == target:
if not requirements:
result.append(path[:])
else:
self._getPaths(o_mt, target, requirements, path, result)
if required:
requirements.append(name)
path.pop()
return result
security.declarePrivate('manage_afterAdd')
def manage_afterAdd(self, item, container):
""" overload manage_afterAdd to finish initialization when the
transform tool is added
"""
Folder.manage_afterAdd(self, item, container)
transforms.initialize(self)
# XXX required?
#try:
# # first initialization
# transforms.initialize(self)
#except:
# # may fail on copy
# pass
security.declareProtected(ManagePortal, 'manage_addTransform')
def manage_addTransform(self, id, module, REQUEST=None):
""" add a new transform to the tool """
transform = Transform(id, module)
self._setObject(id, transform)
self._mapTransform(transform)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'manage_addTransformsChain')
def manage_addTransformsChain(self, id, description, REQUEST=None):
""" add a new transforms chain to the tool """
transform = TransformsChain(id, description)
self._setObject(id, transform)
self._mapTransform(transform)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'manage_setCacheValidityTime')
def manage_setCacheValidityTime(self, seconds, REQUEST=None):
"""set the lifetime of cached data in seconds"""
self.max_sec_in_cache = int(seconds)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'reloadTransforms')
def reloadTransforms(self, ids=()):
""" reload transforms with the given ids
if no ids, reload all registered transforms
return a list of (transform_id, transform_module) describing reloaded
transforms
"""
if not ids:
ids = self.objectIds()
reloaded = []
for id in ids:
o = getattr(self, id)
o.reload()
reloaded.append((id, o.module))
return reloaded
# Policy handling methods #################################################
def manage_addPolicy(self, output_mimetype, required_transforms, REQUEST=None):
""" add a policy for a given output mime type"""
registry = getToolByName(self, 'mimetypes_registry')
if not registry.lookup(output_mimetype):
raise TransformException('Unknown MIME type')
if self._policies.has_key(output_mimetype):
msg = 'A policy for output %s is already defined' % output_mimetype
raise TransformException(msg)
required_transforms = tuple(required_transforms)
self._policies[output_mimetype] = required_transforms
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_editTransformationPolicyForm')
def manage_delPolicies(self, outputs, REQUEST=None):
""" remove policies for given output mime types"""
for mimetype in outputs:
del self._policies[mimetype]
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_editTransformationPolicyForm')
def listPolicies(self):
""" return the list of defined policies
a policy is a 2-tuple (output_mime_type, [list of required transforms])
"""
# XXXFIXME: backward compat, should be removed later
if not hasattr(self, '_policies'):
self._policies = PersistentMapping()
return self._policies.items()
# mimetype oriented conversions (iengine interface) ########################
def registerTransform(self, transform):
"""register a new transform
transform isn't a Zope Transform (the wrapper) but the wrapped transform
the persistence wrapper will be created here
"""
# needed when called from transforms.initialize, which
# registers non-Zope transforms
module = str(transform.__module__)
transform = Transform(transform.name(), module, transform)
if not itransform.isImplementedBy(transform):
raise TransformException('%s does not implement itransform' % transform)
name = transform.name()
__traceback_info__ = (name, transform)
if name not in self.objectIds():
self._setObject(name, transform)
self._mapTransform(transform)
security.declareProtected(ManagePortal, 'ZopeFind')
def ZopeFind(self, *args, **kwargs):
"""Don't break ZopeFind feature when a transform can't be loaded
"""
try:
return Folder.ZopeFind(self, *args, **kwargs)
except MissingBinary:
log('ZopeFind: caught MissingBinary exception')
security.declareProtected(View, 'objectItems')
def objectItems(self, *args, **kwargs):
"""Don't break objectItems when a transform can't be loaded
"""
try:
return Folder.objectItems(self, *args, **kwargs)
except MissingBinary:
log('objectItems: caught MissingBinary exception')
return []
InitializeClass(TransformTool)
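The path search in `_findPath` and `_getPaths` above walks a nested mapping of the form `{input_mimetype: {output_mimetype: [transforms]}}`, collects every loop-free path, and keeps the shortest one that uses all required transforms. A simplified standalone sketch of that idea follows. It assumes plain dicts instead of `PersistentMapping`, checks the required names once per finished path rather than tracking them during recursion, and uses a hypothetical `Step` tuple and illustrative transform names in place of real transform objects.

```python
from collections import namedtuple

# Hypothetical stand-in for a registered transform: only the name matters here.
Step = namedtuple('Step', 'name')


def all_paths(mtmap, orig, target, required=(), path=()):
    """Return every loop-free list of steps from orig to target that uses
    all names in required."""
    results = []
    for out_mt, steps in mtmap.get(orig, {}).items():
        for step in steps:
            if step in path:
                continue  # a step may appear only once per path
            new_path = path + (step,)
            if out_mt == target:
                names = set(s.name for s in new_path)
                if set(required) <= names:
                    results.append(list(new_path))
            else:
                results.extend(
                    all_paths(mtmap, out_mt, target, required, new_path))
    return results


def shortest_path(mtmap, orig, target, required=()):
    """Pick the shortest candidate, or None when there is none."""
    candidates = all_paths(mtmap, orig, target, required)
    if not candidates:
        return None
    return min(candidates, key=len)


# Illustrative map: one direct route and one two-hop route.
rest_to_html = Step('rest_to_html')
html_to_text = Step('html_to_text')
rest_to_text = Step('rest_to_text')
mtmap = {
    'text/x-rst': {'text/html': [rest_to_html], 'text/plain': [rest_to_text]},
    'text/html': {'text/plain': [html_to_text]},
}
direct = shortest_path(mtmap, 'text/x-rst', 'text/plain')
via_html = shortest_path(mtmap, 'text/x-rst', 'text/plain',
                         required=['html_to_text'])
print([s.name for s in direct])
print([s.name for s in via_html])
```

Without requirements the one-hop route wins; naming `html_to_text` as required forces the longer route, which is the effect a policy has in `convertTo`.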
from Products.PortalTransforms.TransformEngine import TransformTool
"""FIXME: backward compat, remove later
"""
from Products.PortalTransforms.chain import Chain as TransformsChain
import os.path
__version__ = open(os.path.join(__path__[0], 'version.txt')).read().strip()
from Products.PortalTransforms.utils import skins_dir
from Products.PortalTransforms.TransformEngine import TransformTool
GLOBALS = globals()
PKG_NAME = 'PortalTransforms'
tools = (
TransformTool,
)
# XXX backward compatibility tricks to keep old PortalTransforms-based
# MimeTypes working (required)
import sys
this_module = sys.modules[__name__]
from Products.MimetypesRegistry import mime_types
setattr(this_module, 'mime_types', mime_types)
from Products.MimetypesRegistry import MimeTypeItem
setattr(this_module, 'MimeTypeItem', MimeTypeItem)
from Products.MimetypesRegistry import MimeTypeItem
sys.modules['Products.PortalTransforms.zope.MimeTypeItem'] = MimeTypeItem
def initialize(context):
from Products.CMFCore.DirectoryView import registerDirectory
#registerDirectory(skins_dir, GLOBALS)
from Products.CMFCore import utils
utils.ToolInit("%s Tool" % PKG_NAME,
tools=tools,
icon="tool.gif",
).initialize(context)
from Products import PortalTransforms as PRODUCT
import os.path
version=PRODUCT.__version__
modname=PRODUCT.__name__
# (major, minor, patchlevel, release info) where release info is:
# -99 for alpha, -49 for beta, -19 for rc and 0 for final
# increment the release info number by one e.g. -98 for alpha2
major, minor, bugfix = version.split('.')[:3]
bugfix, release = bugfix.split('-')[:2]
relinfo=-99 #alpha
if 'beta' in release:
relinfo=-49
if 'rc' in release:
relinfo=-19
if 'final' in release:
relinfo=0
numversion = (int(major), int(minor), int(bugfix), relinfo)
license = 'BSD like'
license_text = open(os.path.join(PRODUCT.__path__[0], 'LICENSE.txt')).read()
copyright = '''Copyright (c) 2003 LOGILAB S.A. (Paris, FRANCE)'''
author = "Archetypes development team"
author_email = "archetypes-devel@lists.sourceforge.net"
short_desc = "MIME types based transformations for the CMF"
long_desc = """This package provides two new CMF tools for MIME type
based transformations of portal content, and with them an easy way
to plug in new transformations for previously
unsupported content types. You will find more info in the package's
README and docs directory.
.
It's part of the Archetypes project, but the only requirement to use it
is to have a CMF based site. If you are using Archetypes, this package
replaces the transform package.
.
Notice this package can also be used as a standalone Python package. If
you've downloaded the Python distribution, you can't make it a Zope
product since Zope files have been removed from this distribution.
"""
web = "http://plone.org/products/archetypes"
ftp = ""
mailing_list = "archetypes-devel@lists.sourceforge.net"
debian_name = "zope-cmftransforms"
debian_maintainer = "Sylvain Thenault"
debian_maintainer_email = "sylvain.thenault@logilab.fr"
debian_handler = "zope"
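The version handling above splits a `major.minor.bugfix-release` string and maps the release tag to a number. A minimal sketch of the same steps as a function follows. The name `parse_version` and the sample version strings are assumptions for illustration, since the real value is read from `version.txt`.

```python
def parse_version(version):
    """Turn 'major.minor.bugfix-release' into a 4-tuple of ints."""
    major, minor, bugfix = version.split('.')[:3]
    bugfix, release = bugfix.split('-')[:2]
    relinfo = -99  # alpha is the default
    if 'beta' in release:
        relinfo = -49
    if 'rc' in release:
        relinfo = -19
    if 'final' in release:
        relinfo = 0
    return (int(major), int(minor), int(bugfix), relinfo)


# Illustrative version strings only.
final = parse_version('1.4.0-final')
candidate = parse_version('1.3.9-rc2')
print(final, candidate)
```

As in the original lines, a version without a `-release` suffix fails at the second unpacking, and the numeric suffix of a tag such as `rc2` does not change the release number.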
<configure
xmlns="http://namespaces.zope.org/five"
>
<bridge
zope2=".interfaces.idatastream"
package=".z3.interfaces"
name="IDataStream"
/>
<bridge
zope2=".interfaces.itransform"
package=".z3.interfaces"
name="ITransform"
/>
<bridge
zope2=".interfaces.ichain"
package=".z3.interfaces"
name="IChain"
/>
<bridge
zope2=".interfaces.iengine"
package=".z3.interfaces"
name="IEngine"
/>
</configure>
"""Cache
"""
from time import time
from Acquisition import aq_base
class Cache:
def __init__(self, context, _id='_v_transform_cache'):
self.context = context
self._id =_id
def _genCacheKey(self, identifier, *args):
key = identifier
for arg in args:
key = '%s_%s' % (key, arg)
key = key.replace('/', '_')
key = key.replace('+', '_')
key = key.replace('-', '_')
key = key.replace(' ', '_')
return key
def setCache(self, key, value):
"""cache a value indexed by key"""
if not value.isCacheable():
return
context = self.context
key = self._genCacheKey(key)
if getattr(aq_base(context), self._id, None) is None:
setattr(context, self._id, {})
getattr(context, self._id)[key] = (time(), value)
return key
def getCache(self, key):
"""try to get a cached value for key
return None if not present
else return a tuple (time spent in cache, value)
"""
context = self.context
key = self._genCacheKey(key)
dict = getattr(context, self._id, None)
if dict is None :
return None
try:
orig_time, value = dict.get(key, None)
return time() - orig_time, value
except TypeError:
return None
def purgeCache(self, key=None):
"""Remove cache
"""
context = self.context
id = self._id
if not hasattr(aq_base(context), id):
return
if key is None:
delattr(context, id)
else:
cache = getattr(context, id)
key = self._genCacheKey(key)
if cache.has_key(key):
del cache[key]
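The `Cache` class above normalises keys and stores each value with the time it was cached, so that `getCache` can report how long the entry has been held. A minimal sketch of those two ideas follows, assuming a plain dict as storage instead of a volatile attribute on a Zope object. `DictCache`, `gen_cache_key` and the injectable `clock` argument are hypothetical names introduced for this example.

```python
import time


def gen_cache_key(identifier, *args):
    """Join identifier and args with '_' and normalise '/', '+', '-', ' '."""
    key = identifier
    for arg in args:
        key = '%s_%s' % (key, arg)
    for char in '/+- ':
        key = key.replace(char, '_')
    return key


class DictCache(object):
    """Hypothetical stand-in for Cache that stores entries in a plain dict."""

    def __init__(self, clock=time.time):
        self._store = {}
        self._clock = clock  # injectable so the example is deterministic

    def set(self, key, value):
        self._store[gen_cache_key(key)] = (self._clock(), value)

    def get(self, key):
        """Return (seconds spent in cache, value), or None if absent."""
        entry = self._store.get(gen_cache_key(key))
        if entry is None:
            return None
        stored_at, value = entry
        return self._clock() - stored_at, value


# A fake clock makes the elapsed time predictable.
now = [100.0]
cache = DictCache(clock=lambda: now[0])
cache.set('text/x-html-safe', '<p>hi</p>')
now[0] = 130.0
age, value = cache.get('text/x-html-safe')
print(gen_cache_key('text/x-html-safe'), age, value)
```

A caller such as `convertTo` can then compare the returned age with its own lifetime limit before reusing the value.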
from Products.PageTemplates.PageTemplateFile import PageTemplateFile
from Globals import Persistent
from Globals import InitializeClass
from Acquisition import Implicit
from OFS.SimpleItem import Item
from AccessControl.Role import RoleManager
from AccessControl import ClassSecurityInfo
from Products.CMFCore.permissions import ManagePortal, ManageProperties
from Products.CMFCore.utils import getToolByName
from Products.PortalTransforms.utils import TransformException, _www
from Products.PortalTransforms.interfaces import ichain
from Products.PortalTransforms.interfaces import itransform
from UserList import UserList
class chain(UserList):
"""A chain of transforms used to transform data"""
__implements__ = (ichain, itransform)
def __init__(self, name='',*args):
UserList.__init__(self, *args)
self.__name__ = name
if args:
self._update()
def name(self):
return self.__name__
def registerTransform(self, transform):
self.append(transform)
def unregisterTransform(self, name):
for i in range(len(self)):
tr = self[i]
if tr.name() == name:
self.pop(i)
break
else:
raise Exception('No transform named %s registered' % name)
def convert(self, orig, data, **kwargs):
for transform in self:
data = transform.convert(orig, data, **kwargs)
orig = data.getData()
md = data.getMetadata()
md['mimetype'] = self.output
return data
def __setitem__(self, key, value):
UserList.__setitem__(self, key, value)
self._update()
def append(self, value):
UserList.append(self, value)
self._update()
def insert(self, *args):
UserList.insert(self, *args)
self._update()
def remove(self, *args):
UserList.remove(self, *args)
self._update()
def pop(self, *args):
value = UserList.pop(self, *args)
self._update()
return value
def _update(self):
self.inputs = self[0].inputs
self.output = self[-1].output
for i in range(len(self)):
if hasattr(self[-i-1], 'output_encoding'):
self.output_encoding = self[-i-1].output_encoding
break
else:
try:
del self.output_encoding
except:
pass
class TransformsChain(Implicit, Item, RoleManager, Persistent):
""" a transforms chain is a suite of transforms to apply in order.
It follows the transform API so that a chain is itself a transform.
"""
meta_type = 'TransformsChain'
meta_types = all_meta_types = ()
manage_options = (
({'label':'Configure',
'action':'manage_main'},
{'label':'Reload',
'action':'manage_reloadTransform'},) +
Item.manage_options
)
manage_main = PageTemplateFile('editTransformsChain', _www)
manage_reloadTransform = PageTemplateFile('reloadTransform', _www)
security = ClassSecurityInfo()
def __init__(self, id, description, ids=()):
self.id = id
self.description = description
self._object_ids = list(ids)
self.inputs = ('application/octet-stream',)
self.output = 'application/octet-stream'
self._chain = None
def __setstate__(self, state):
""" __setstate__ is called whenever the instance is loaded
from the ZODB, like when Zope is restarted.
We should rebuild the chain at this time
"""
TransformsChain.inheritedAttribute('__setstate__')(self, state)
self._chain = None
def _chain_init(self):
""" build the transforms chain """
tr_tool = getToolByName(self, 'portal_transforms')
self._chain = c = chain()
for id in self._object_ids:
object = getattr(tr_tool, id)
c.registerTransform(object)
self.inputs = c.inputs or ('application/octet-stream',)
self.output = c.output or 'application/octet-stream'
security.declarePublic('convert')
def convert(self, *args, **kwargs):
""" apply the chain of transforms and return the result """
if self._chain is None:
self._chain_init()
return self._chain.convert(*args, **kwargs)
security.declarePublic('name')
def name(self):
"""return the name of the transform instance"""
return self.id
security.declarePrivate('manage_beforeDelete')
def manage_beforeDelete(self, item, container):
Item.manage_beforeDelete(self, item, container)
if self is item:
# unregister self from catalog on deletion
tr_tool = getToolByName(self, 'portal_transforms')
tr_tool.unregisterTransform(self.id)
security.declareProtected(ManagePortal, 'manage_addObject')
def manage_addObject(self, id, REQUEST=None):
""" add a new transform or chain to the chain """
assert id not in self._object_ids
self._object_ids.append(id)
self._chain_init()
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'manage_delObjects')
def manage_delObjects(self, ids, REQUEST=None):
""" remove the selected transforms from the chain """
for id in ids:
self._object_ids.remove(id)
self._chain_init()
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
# transforms order handling #
security.declareProtected(ManagePortal, 'move_object_to_position')
def move_object_to_position(self, id, newpos):
""" overridden from OrderedFolder to store id instead of objects
"""
oldpos = self._object_ids.index(id)
if (newpos < 0 or newpos == oldpos or newpos >= len(self._object_ids)):
return 0
self._object_ids.pop(oldpos)
self._object_ids.insert(newpos, id)
self._chain_init()
return 1
security.declareProtected(ManageProperties, 'move_object_up')
def move_object_up(self, id, REQUEST=None):
""" move object with the given id up in the list """
newpos = self._object_ids.index(id) - 1
self.move_object_to_position(id, newpos)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManageProperties, 'move_object_down')
def move_object_down(self, id, REQUEST=None):
""" move object with the given id down in the list """
newpos = self._object_ids.index(id) + 1
self.move_object_to_position(id, newpos)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
# Z transform interface #
security.declareProtected(ManagePortal, 'reload')
def reload(self):
""" reload the module where the transformation class is defined """
for tr in self.objectValues():
tr.reload()
# utilities #
security.declareProtected(ManagePortal, 'listAddableObjectIds')
def listAddableObjectIds(self):
""" return the ids of the transforms that can be added to the chain """
tr_tool = getToolByName(self, 'portal_transforms')
return [id for id in tr_tool.objectIds() if not (id == self.id or id in self._object_ids)]
security.declareProtected(ManagePortal, 'objectIds')
def objectIds(self):
""" return the ids of the transforms in the chain """
return tuple(self._object_ids)
security.declareProtected(ManagePortal, 'objectValues')
def objectValues(self):
""" return the transforms in the chain """
tr_tool = getToolByName(self, 'portal_transforms')
return [getattr(tr_tool, id) for id in self.objectIds()]
InitializeClass(TransformsChain)
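The `chain.convert` method above feeds each transform the data produced by the previous one and finally stamps the result with the output type of the last step. A minimal standalone sketch of that loop follows. `Packet`, `Upper`, `Exclaim` and the `text/x-upper` and `text/x-loud` MIME types are invented for this example and only imitate the `convert(orig, data)` calling convention.

```python
class Packet(object):
    """Hypothetical minimal datastream: holds data plus a metadata dict."""

    def __init__(self):
        self.data = ''
        self.metadata = {}


class Upper(object):
    inputs = ('text/plain',)
    output = 'text/x-upper'

    def convert(self, orig, packet):
        packet.data = orig.upper()
        return packet


class Exclaim(object):
    inputs = ('text/x-upper',)
    output = 'text/x-loud'

    def convert(self, orig, packet):
        packet.data = orig + '!'
        return packet


def run_chain(steps, orig, packet):
    """Apply steps in order; each step receives the previous step's data."""
    for step in steps:
        packet = step.convert(orig, packet)
        orig = packet.data
    # The chain advertises the output type of its last step.
    packet.metadata['mimetype'] = steps[-1].output
    return packet


result = run_chain([Upper(), Exclaim()], 'hello', Packet())
print(result.data, result.metadata)
```

The chain as a whole therefore behaves like a single transform whose inputs come from the first step and whose output comes from the last.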
<configure xmlns="http://namespaces.zope.org/zope"
xmlns:five="http://namespaces.zope.org/five">
<include file="bridge.zcml"/>
<include file="implements.zcml"/>
<five:deprecatedManageAddDelete
class=".TransformEngine.TransformTool" />
<five:deprecatedManageAddDelete
class=".Transform.Transform" />
</configure>
from Products.PortalTransforms.interfaces import idatastream
class datastream:
"""A transformation datastream packet"""
__implements__ = idatastream
__slots__ = ('name', '_data', '_metadata')
def __init__(self, name):
self.__name__ = name
self._data = ''
self._metadata = {}
self._objects = {}
self._cacheable = 1
def __str__(self):
return self.getData()
def name(self):
return self.__name__
def setData(self, value):
"""set the main data produced by a transform, i.e. usually a string"""
self._data = value
def getData(self):
"""provide access to the transformed data object, i.e. usually a string.
This data may reference subobjects.
"""
if callable(self._data):
data = self._data()
else:
data = self._data
return data
def setSubObjects(self, objects):
"""set a dict-like object containing subobjects.
keys should be object's identifier (e.g. usually a filename) and
values object's content.
"""
self._objects = objects
def getSubObjects(self):
"""return a dict-like object with any optional subobjects associated
with the data"""
return self._objects
def getMetadata(self):
"""return a dict-like object with any optional metadata from
the transform"""
return self._metadata
def isCacheable(self):
"""Return a bool which indicates whether the result should be cached
Default is true
"""
return self._cacheable
def setCacheable(self, value):
"""Set cacheable flag to yes or no
"""
self._cacheable = not not value
#data = property('getData', 'setData', None, """idata.data""")
#metadata = property('getMetadata', 'setMetadata', None,
#"""idata.metadata""")
===================================
Portal Transforms' Developer manual
===================================
:Author: Sylvain Thenault
:Contact: syt@logilab.fr
:Date: $Date: 2005-08-19 23:43:41 +0200 (Fre, 19 Aug 2005) $
:Version: $Revision: 1.5 $
:Web site: http://sourceforge.net/projects/archetypes
.. contents::
Tools interfaces
----------------
The MIME types registry
```````````````````````
class isourceAdapter(Interface):
def __call__(data, \**kwargs):
"""convert data to unicode, may take optional kwargs to aid in conversion"""
class imimetypes_registry(isourceAdapter):
def classify(data, mimetype=None, filename=None):
"""return a content type for this data or None
None should rarely be returned as application/octet-stream can be
used to represent most types.
"""
def lookup(mimetypestring):
"""Look up imimetype objects matching mimetypestring.
mimetypestring may have an empty minor part or contain a wildcard (*).
mimetypestring may be an imimetype object (in this case it will be
returned unchanged); otherwise it should be an RFC-2046 name.
Return a list of mimetype objects associated with the RFC-2046 name.
Return an empty list if none is known.
"""
def lookupExtension(filename):
""" return the mimetypes object associated with the file's extension
or None if it is not known.
filename may be a file name like 'content.txt' or an extension like 'rest'
"""
def mimetypes():
"""return all defined mime types, each one implements at least imimetype
"""
def list_mimetypes():
"""return all defined mime types, as string"""
The transformation tool
```````````````````````
class iengine(Interface):
def registerTransform(transform):
"""register a transform
transform must implement itransform
"""
def unregisterTransform(name):
""" unregister a transform
name is the name of a registered transform
"""
def convertTo(mimetype, orig, idata=None, \**kwargs):
"""Convert orig to a given mimetype
return an object implementing idatastream or None if no path has been
found
"""
def convert(name, orig, idata=None, \**kwargs):
"""run a transform of a given name on data
name is the name of a registered transform
return an object implementing idatastream
"""
def __call__(name, orig, idata=None, \**kwargs):
"""run a transform returning the raw data product
name is the name of a registered transform
return the raw data, i.e. an encoded string
"""
Writing a new transformation
----------------------------
Writing a new transform should be easy: you only have to implement a
simple interface. Knowing a few advanced features and the provided
utilities will help you do it faster.
Related interfaces
``````````````````
class itransform(Interface):
"""A transformation plugin -- transform data somehow; must be threadsafe and stateless"""
inputs = Attribute("""list of imimetypes (or registered rfc-2046
names) this transform accepts as inputs""")
output = Attribute("output imimetype as instance or rfc-2046 name")
def name():
"""return the name of the transform instance"""
def convert(data, idata, \**kwargs):
"""convert the data, store the result in idata and return that"""
class idatastream(Interface):
"""data stream, is the result of a transform"""
def setData(value):
"""set the main data produced by a transform, i.e. usually a string"""
def getData():
"""provide access to the transformed data object, i.e. usually a string.
This data may reference subobjects.
"""
def setSubObjects(objects):
"""set a dict-like object containing subobjects.
keys should be object's identifier (e.g. usually a filename) and
values object's content.
"""
def getSubObjects():
"""return a dict-like object with any optional subobjects associated
with the data"""
def getMetadata():
"""return a dict-like object with any optional metadata from
the transform"""
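
To make the two interfaces above concrete, here is a minimal sketch of a transform and of a data stream it can write to. It is written for a current Python and is not taken from the product: `ToyDataStream` and `UpperCase` are hypothetical names, and registration with the engine is left out.

```python
class ToyDataStream:
    """Hypothetical stand-in for an idatastream implementation."""

    def __init__(self, name):
        self._name = name
        self._data = ""
        self._objects = {}
        self._metadata = {}

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data

    def setSubObjects(self, objects):
        self._objects = objects

    def getSubObjects(self):
        return self._objects

    def getMetadata(self):
        return self._metadata


class UpperCase:
    """Hypothetical transform; it keeps no per-call state on the instance."""

    inputs = ("text/plain",)
    output = "text/plain"

    def name(self):
        return "upper_case"

    def convert(self, data, idata, **kwargs):
        # Store the result in the stream we were given and return that stream.
        idata.setData(data.upper())
        return idata
```

A caller would run `UpperCase().convert("cola", ToyDataStream("upper_case"))` and read the result back with `getData()`.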
Important note about encoding
`````````````````````````````
A transform receives data as an encoded string. A priori, no assumption can be
made about the encoding used. Data returned by a transform must use the same
encoding as the received data, unless the transform provides an *output_encoding*
attribute indicating the output encoding (this may be useful for XSLT based
transforms, for instance).
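
As a sketch of the second case, the hypothetical transform below always emits UTF-8 and announces it through an *output_encoding* attribute. The `encoding` keyword argument is an assumption made for this example only; this manual does not say how the input encoding reaches a transform. `Holder` is a hypothetical stand-in for an idatastream object.

```python
class Holder:
    """Hypothetical stand-in for an idatastream object."""

    def __init__(self):
        self._data = b""

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data


class Utf8Normalizer:
    """Hypothetical transform whose output is always UTF-8."""

    inputs = ("text/plain",)
    output = "text/plain"
    # Declared because the output encoding may differ from the input one.
    output_encoding = "utf-8"

    def name(self):
        return "utf8_normalizer"

    def convert(self, data, idata, encoding="latin-1", **kwargs):
        # `encoding` is an assumed keyword argument, not a documented one.
        text = data.decode(encoding)
        idata.setData(text.encode(self.output_encoding))
        return idata
```

A transform without *output_encoding* would instead have to hand back bytes in whatever encoding it received.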
Configurable transformation
```````````````````````````
You can make your transformation configurable through the ZMI by setting a
*config* dictionary on your transform instance or class. Keys are parameter
names and values are parameter values. Another dictionary, *config_metadata*,
describes each parameter. In this mapping, keys are also parameter names but
values are 3-tuples: (<parameter's type>, <parameter's label>, <parameter's
description>).
Possible types for parameters are :
:int: field is an integer
:string: field is a string
:list: field is a list
:dict: field is a dictionary
You can look at the **command** and **xml** transforms for examples of
configurable transforms.
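
The two mappings could look like the following sketch. `Truncate`, its parameters, and the `Holder` stand-in for an idatastream object are hypothetical; only the *config* / *config_metadata* layout follows the description above.

```python
class Holder:
    """Hypothetical stand-in for an idatastream object."""

    def __init__(self):
        self._data = ""

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data


class Truncate:
    """Hypothetical configurable transform."""

    inputs = ("text/plain",)
    output = "text/plain"

    # parameter name -> current value
    config = {"max_length": 10, "suffix": "..."}

    # parameter name -> (type, label, description)
    config_metadata = {
        "max_length": ("int", "Maximum length", "Characters kept from the input"),
        "suffix": ("string", "Suffix", "Text appended when the input is cut"),
    }

    def name(self):
        return "truncate"

    def convert(self, data, idata, **kwargs):
        limit = self.config["max_length"]
        if len(data) > limit:
            data = data[:limit] + self.config["suffix"]
        idata.setData(data)
        return idata
```

Every key of *config* has a matching entry in *config_metadata*, so each parameter comes with a type, a label and a description.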
Images / sub-objects management
````````````````````````````````
A transformation may produce sub-objects, for instance images when you convert a
PDF document to HTML. That's the purpose of the setSubObjects method of
the idatastream interface.
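
A sketch of how sub-objects might be gathered before being handed to the data stream, assuming the converter has left image files in a temporary directory. The helper `collect_images` and the file names are hypothetical.

```python
import os
import shutil
import tempfile


def collect_images(tmpdir, extensions=("png", "jpg", "gif")):
    """Map each image file name found in tmpdir to its binary content."""
    objects = {}
    for filename in sorted(os.listdir(tmpdir)):
        ext = os.path.splitext(filename)[1].lstrip(".").lower()
        if ext in extensions:
            with open(os.path.join(tmpdir, filename), "rb") as handle:
                objects[filename] = handle.read()
    return objects


# Demo: fake the files an external converter might leave behind.
tmpdir = tempfile.mkdtemp()
try:
    for filename, content in (("page.html", b"<html/>"), ("fig1.png", b"PNG")):
        with open(os.path.join(tmpdir, filename), "wb") as handle:
            handle.write(content)
    subobjects = collect_images(tmpdir)
finally:
    shutil.rmtree(tmpdir)
```

Inside a transform's convert method, the resulting dictionary would be passed on with `idata.setSubObjects(subobjects)`.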
Some utilities
``````````````
Transform utilities may be found in the libtransforms subpackage. You'll find
the following modules there:
*commandtransform*
provides a base class for external command based transforms.
*retransform*
provides a base class for regular expression based transforms.
*html4zope*
provides a docutils HTML writer for Zope.
*utils*
provides some utility functions.
Write a test for your transform!
````````````````````````````````
Every transform should have its own test, and writing one is easy. Imagine you
have made a transform named "colabeer" which transforms cola into beer (finding
the MIME types for these content types is left to you ;). Basically,
your test file should be:
from unittest import TestSuite, makeSuite, main
from test_transforms import make_tests

# one tuple per transform to test
tests = (
    ('Products.MyTransforms.colabeer', "drink.cola", "drink.beer", None, 0),
)

def test_suite():
    return TestSuite([makeSuite(test) for test in make_tests(tests)])

if __name__ == '__main__':
    main(defaultTest='test_suite')
In this example :
- "Products.MyTransforms.colabeer" is the module defining your transform (you
can also give directly the transform instance).
- "drink.cola" is the name of the file containing data to give to your transform
as input.
- "drink.beer" is the file containing expected transform result (what the getData
method of idatastream will return).
- Additional arguments (*None* and *0*) are respectively an optional normalizing
function to apply to both the transform result and the output file content, and
the number of subobjects that the transform is expected to produce.
This example assumes your test is in the *tests* directory of PortalTransforms
and your input and output files are in *tests/input* and *tests/output*
respectively.
REST2HTML=html.py --compact-lists --date --generator
all: user_manual.html dev_manual.html
user_manual.html: user_manual.rst
$(REST2HTML) user_manual.rst user_manual.html
dev_manual.html: dev_manual.rst
$(REST2HTML) dev_manual.rst dev_manual.html
clean:
rm *.html
===================================
How to set up PyUNO for Zope
===================================
:Author: Junyong Pan <panjy at zopechina.com>, Anton Stonor <stonor@giraffen.dk>
:Date: $Date: 2003-08-12 02:50:50 -0800 (Tue, 12 Aug 2003) $
:Version: $Revision: 1.5 $
(to be refined)
Portal Transforms allows you to convert Word documents to HTML. A cool feature
if you want to preview Word documents at your web site or use Word as a web
authoring tool.
To do the actual transform, Portal Transforms relies on a third party application
to do the heavy lifting. If you have not installed such an application, Portal
Transforms will not perform Word to HTML transforms.
One of the options is Open Office. It is not the easiest application to set up
to work with Portal Transforms, but it works on both Windows and Unix and delivers
fairly good HTML.
Problems
====================
- PyUNO is cool, but it now ships with its own Python interpreter, which is not compatible with Zope's.
- PyUNO is not threadsafe yet.
SETTING UP OPEN OFFICE ON WINDOWS
=======================================
WARNING: You can set up PyUNO, but you can't use it concurrently. See 'Install oood'.
1) Install Open Office 2.0
Just run the standard installer.
PyUNO in this version ships with Python 2.3, which is compatible with Zope 2.7.
2) Set the environment PATH
Add the Open Office program dir to the Windows PATH, e.g.
C:\Program Files\OpenOffice.org 1.9.82\program
See this article on how to set the Windows PATH:
http://vlaurie.com/computers2/Articles/environment.htm
You can also look at the python.bat (located in your Open Office program dir)
for inspiration.
3) Set the PYTHONPATH
You need to add these directories to the PYTHONPATH:
a) The Open Office program dir (e.g. C:\Program Files\OpenOffice.org 1.9.82\program)
b) The Open Office python lib dir (e.g. C:\Program Files\OpenOffice.org 1.9.82\program\python-core-2.3.4\lib)
From the Windows system shell, just run e.g.:
set PYTHONPATH=C:\Program Files\OpenOffice.org 1.9.82\program;C:\Program Files\OpenOffice.org 1.9.82\program\python-core-2.3.4\lib
You can also look at the python.bat (located in your Open Office program dir)
for inspiration.
4) Start Open Office as UNO-server
Run soffice "-accept=socket,host=localhost,port=2002;urp;"
5) Now it should work
For Debian Linux Users
=========================
see: http://bibus-biblio.sourceforge.net/html/en/LinuxInstall.html
1. install version 1.1, which doesn't contain pyuno::
apt-get install openoffice
2. install a version of pyuno which enables UCS-4 Unicode
- you can download at http://sourceforge.net/projects/bibus-biblio/
- copy to /usr/lib/openoffice/program
3. set up environment variables
OPENOFFICE_PATH="/usr/lib/openoffice/program"
export PYTHONPATH="$OPENOFFICE_PATH"
export LD_LIBRARY_PATH="$OPENOFFICE_PATH"
Install oood
===================
Note, this product is for linux only
http://udk.openoffice.org/python/oood/
UNDERSTANDING OPEN OFFICE AND UNO
=============================================
Open Office allows programmers to remotely control it. Portal Transforms takes
advantage of this opportunity by scripting Open Office from Python. This is possible
through PyUNO, which exposes the Open Office API in Python.
You can't download and install PyUNO as a module for the Python
interpreter that is running your Zope server. PyUNO only comes bundled with Open
Office and the Python that is distributed with Open Office. To make PyUNO work
from within your standard Python you must extend the PYTHONPATH as done above so
that Python will also look inside Open Office for modules. If it works you should be
able to start up a Python shell and do
>>> import uno
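
The same check can be scripted, for instance before starting Zope. This is only a sketch for a current Python; the helper name `pyuno_available` is hypothetical.

```python
import importlib


def pyuno_available():
    """Return True when the `uno` module can be imported from this Python."""
    try:
        importlib.import_module("uno")
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if pyuno_available():
        print("PyUNO can be imported")
    else:
        print("PyUNO not found: check PYTHONPATH")
```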
In some cases you can be unlucky and the Python used for Zope is not in sync with
the Python that is distributed with Open Office. That is solved by rebuilding
Python -- a task that is beyond the scope of this guide.
==============================
Portal Transforms' User manual
==============================
:Author: Sylvain Thénault
:Contact: syt@logilab.fr
:Date: $Date: 2005-08-19 23:43:41 +0200 (Fre, 19 Aug 2005) $
:Version: $Revision: 1.7 $
:Web site: http://sourceforge.net/projects/archetypes
.. contents::
What does this package provide ?
================================
This package is both a Python library for MIME type based content
transformation, including a command line tool (what I call the Python package),
and a Zope product providing two new tools for the CMF (what I call the Zope
product). A Python-only distribution will be available, in which the Zope
specific files are not included.
Python side
===========
The *transform* command line tool
`````````````````````````````````
command line tool for MIME type based transformation
USAGE: transform [OPTIONS] input_file output_file
OPTIONS:
-h / --help
display this help message and exit.
-o / --output <output mime type>
output MIME type. (conflicts with --transform)
-t / --transform <transform id>
id of the transform to apply. (conflicts with --output)
EXAMPLE:
$ transform -o text/html dev_manual.rst dev_manual.html
Customization hook
``````````````````
You can customize the transformation engine by providing a module
"transform_customize" somewhere in your Python path. The module must provide an
*initialize* function which takes the engine as its only argument. This function
is responsible for initializing the engine with the desired
transformations. When the module is not found, the *initialize* function from the
*transforms* subpackage is used.
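
A customization module could be as small as the sketch below. The `Reverse` transform is hypothetical and exists only for this illustration; the only call made on the engine is `registerTransform`, which the engine interface declares.

```python
# transform_customize.py -- must be importable from the Python path


class Reverse:
    """Hypothetical transform used only for this illustration."""

    inputs = ("text/plain",)
    output = "text/plain"

    def name(self):
        return "reverse"

    def convert(self, data, idata, **kwargs):
        idata.setData(data[::-1])
        return idata


def initialize(engine):
    """Register the desired transforms on the engine passed in."""
    engine.registerTransform(Reverse())
```

Because this module replaces the default initialization, any default transform you still want must be registered here as well.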
Zope side
=========
The MIME types registry
```````````````````````
This tool registers known MIME types. The information associated with a MIME
type is:
* a title
* a list of RFC-2046 types
* a list of file extensions
* a binary flag
* an icon path
You can see registered types by going to the *mimetypes_registry* object at the
root of your CMF site, using the ZMI. There you can modify existing information
or add / delete types. This product comes with a default set of MIME type icons
located in portal_skins/mimetypes_icons.
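
To illustrate how such records can be queried, here is a toy, in-memory registry written for a current Python. `ToyRegistry` and its data are hypothetical, and it returns plain strings where the real tool returns MIME type objects; only the method names `lookup` and `lookupExtension` come from the registry interface.

```python
import fnmatch
import os


class ToyRegistry:
    """Hypothetical stand-in for the registry; it stores plain strings."""

    def __init__(self):
        # extension -> RFC-2046 name (illustrative data only)
        self._by_ext = {"txt": "text/plain", "rst": "text/x-rst",
                        "html": "text/html"}

    def lookup(self, mimetypestring):
        """Return every known type matching the name; '*' is a wildcard."""
        major, _, minor = mimetypestring.partition("/")
        pattern = "%s/%s" % (major, minor or "*")  # empty minor matches all
        known = sorted(set(self._by_ext.values()))
        return [name for name in known if fnmatch.fnmatchcase(name, pattern)]

    def lookupExtension(self, filename):
        """Accept 'content.txt' as well as a bare extension such as 'rst'."""
        ext = os.path.splitext(filename)[1].lstrip(".") or filename
        return self._by_ext.get(ext.lower())
```

With this data, `lookup("text/*")` returns all three text types, and `lookupExtension("content.txt")` returns `"text/plain"`.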
The transformation tool
```````````````````````
It's a MIME type based transformation engine. It has been designed to
transform portal content from a given MIME type to another. You can add / delete
transformations by going to the *portal_transforms* object at the root of your
CMF site, using the ZMI. Some transformations are configurable, but not all. A
transform is a Python object implementing a special interface. See the
developer documentation if you're interested in writing a new
transformation.
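
Converting from one MIME type to another may need several transforms in a row. The sketch below shows one way such a path could be searched, using a breadth-first search over (name, inputs, output) triples. It is an assumption made for illustration, not the tool's actual algorithm, and `find_path` and `TRANSFORMS` are hypothetical names.

```python
from collections import deque


def find_path(transforms, source, target):
    """Breadth-first search over (name, inputs, output) triples.

    Returns the list of transform names leading from source to target,
    or None when no path exists.
    """
    if source == target:
        return []
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        mimetype, path = queue.popleft()
        for name, inputs, output in transforms:
            if mimetype not in inputs or output in seen:
                continue
            if output == target:
                return path + [name]
            seen.add(output)
            queue.append((output, path + [name]))
    return None


# Illustrative registrations only; the names echo the default transforms.
TRANSFORMS = [
    ("rest", ("text/x-rst",), "text/html"),
    ("lynx_dump", ("text/html",), "text/plain"),
]
```

Here a request for text/plain from a text/x-rst source is answered by running *rest* and then *lynx_dump*; a request with no connecting transforms yields None.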
Archetypes integration
``````````````````````
Archetypes will use this product for automatic transformation if you have
configured it to use the new base unit (set USE_NEW_BASEUNIT to 1 in the
Archetypes config.py). If you're using the old base unit (still default in 1.0),
the transform tool won't be used (at least by the Archetypes library).
Default transformations
=======================
The default transformations are described here. They are separated into two groups,
safe and unsafe. Safe transforms are located in the *transforms* directory of this
product. Unsafe transforms are located in the *unsafe_transforms* directory and
are not registered by default. Moreover, there is no __init__.py file in this
directory, so manual intervention is required to make them addable to the
transforms tool. Unsafe transforms are usually so called because they allow
configuration of a path to a binary executable on the server, which may be
undesirable for Zope service providers.
Safe transforms
```````````````
*st*
transforms Structured Text to HTML. Not configurable.
*rest*
transforms reStructuredText to HTML. You need docutils to use this
transformation. Not configurable.
*word_to_html*
transforms Microsoft Word files to HTML, using either COM (on Windows), wvWare or
PyUNO (from OpenOffice.org). Not configurable.
*pdf_to_html*
transforms Adobe PDF to HTML. This transform requires the "pdftohtml"
program. Not configurable.
*lynx_dump*
transforms HTML to plain text. This transform requires the "lynx"
program. Not configurable.
*python*
transforms Python source code to colorized HTML. You can configure the
colors used.
*text_to_html*
transforms plain text files to HTML by replacing new lines with
<br/>. You can configure the allowable inputs for this transform.
*rest_to_text*
This is an example use of the *identity* transform, which does
basically nothing :). It's used here to transform ReST files
(text/x-rst) to text/plain. You can configure the allowable inputs and
output of this transform.
Unsafe transforms
`````````````````
*command*
this is a fully configurable transform based on external commands. For
instance, you can obtain the same transformation as the previous
*lynx_dump*:
1. add a new transform named "lynx_dump" with
"Products.PortalTransforms.unsafe_transforms.command" as module
(this supposes that you've added a __init__.py file to the
unsafe_transforms directory to make them importable).
2. go to the configure tab of this transform and set the following
parameters :
:binary_path: '/usr/bin/lynx'
:command_line: '-dump %s'
:input: 'text/html'
:output: 'text/plain'
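
How these parameters combine into the command that is run can be sketched as follows. This assumes that the `%s` in *command_line* stands for the path of the input file, which this manual does not state; `build_command` is a hypothetical helper, not part of the product.

```python
import shlex


def build_command(binary_path, command_line, input_path):
    """Substitute the input file into command_line and prepend the binary."""
    # Quote the path so that spaces or shell metacharacters stay harmless.
    arguments = command_line % shlex.quote(input_path)
    return "%s %s" % (binary_path, arguments)
```

With the parameters above and an input file at `/tmp/page.html`, the resulting command is `/usr/bin/lynx -dump /tmp/page.html`.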
*xml*
this transform has been designed to handle XML files on a doctype / DTD
basis. All the real transformation work is done by an XSLT processor. This
transform only associates XSLT stylesheets with doctypes or DTDs, and gives
the correct stylesheet to the processor when some content has to be
transformed.
FIXME: add an example on how to setup docbook transform.
Advanced features
=================
Transformation chains
`````````````````````
A transformation chain is an ordered sequence of transformations. A chain
is itself a transformation. You can build a transformation chain
using the ZMI.
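
The idea that a chain is itself a transformation can be sketched in a few lines. `ToyChain`, `Strip`, `Upper` and `Holder` are hypothetical names for this illustration; only `registerTransform`, `name` and `convert` come from the interfaces.

```python
class Holder:
    """Hypothetical stand-in for an idatastream object."""

    def __init__(self):
        self._data = ""

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data


class Strip:
    def name(self):
        return "strip"

    def convert(self, data, idata, **kwargs):
        idata.setData(data.strip())
        return idata


class Upper:
    def name(self):
        return "upper"

    def convert(self, data, idata, **kwargs):
        idata.setData(data.upper())
        return idata


class ToyChain:
    """Runs its transforms in order; exposes the same name/convert methods."""

    def __init__(self, name):
        self._name = name
        self._transforms = []

    def name(self):
        return self._name

    def registerTransform(self, transform):
        self._transforms.append(transform)

    def convert(self, data, idata, **kwargs):
        for transform in self._transforms:
            idata = transform.convert(data, idata, **kwargs)
            data = idata.getData()  # output of one step feeds the next
        return idata
```

Since `ToyChain` offers `name` and `convert` like any other transform, it could in turn be registered in another chain.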
Transformation policy
`````````````````````
You can set simple transformation policies for the transforms
tool. A policy says that when you try to convert content to a given
MIME type, you have to include a given transformation. For instance,
imagine you have an *html_tidy* transformation which tidies HTML pages:
you can say that the transformation path to text/html should include the
*html_tidy* transform.
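
One way to picture a policy is as a filter on candidate transformation paths. The sketch below is an assumption made for illustration and does not describe how the tool stores or applies policies; `apply_policy`, `POLICIES` and `PATHS_TO_HTML` are hypothetical names.

```python
def apply_policy(candidate_paths, required):
    """Return the first path that contains every required transform."""
    for path in candidate_paths:
        if all(name in path for name in required):
            return path
    return None


# target MIME type -> transforms that must appear in the path
POLICIES = {"text/html": ("html_tidy",)}

# two hypothetical ways of reaching text/html from a ReST source
PATHS_TO_HTML = [["rest"], ["rest", "html_tidy"]]
```

With this policy, the shorter path is rejected and the one ending with *html_tidy* is chosen.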
Caches
``````
For efficiency, transformation results are cached. You can set the
lifetime of a cached result using the ZMI. This is a time expressed in
seconds.
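
The effect of the lifetime setting can be sketched with a small time-based cache. `TimedCache` is hypothetical and only mirrors the behaviour described above; the product's own cache is implemented differently.

```python
import time


class TimedCache:
    """Keep values for at most `lifetime` seconds."""

    def __init__(self, lifetime, clock=time.time):
        self.lifetime = lifetime
        self._clock = clock  # injectable so the behaviour can be tested
        self._entries = {}

    def set(self, key, value):
        self._entries[key] = (self._clock(), value)

    def get(self, key):
        """Return the cached value, or None when missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self.lifetime:
            del self._entries[key]
            return None
        return value
```

A result stored under its target MIME type is returned as long as it is younger than the lifetime; after that the transformation has to run again.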
<configure
xmlns="http://namespaces.zope.org/five"
>
<implements
class=".chain.chain"
interface=".z3.interfaces.IChain"
/>
<implements
class=".chain.chain"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".data.datastream"
interface=".z3.interfaces.IDataStream"
/>
<implements
class=".Transform.Transform"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".TransformEngine.TransformTool"
interface=".z3.interfaces.IEngine"
/>
<implements
class=".libtransforms.commandtransform.commandtransform"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".libtransforms.commandtransform.popentransform"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".libtransforms.retransform.retransform"
interface=".z3.interfaces.ITransform"
/>
<!-- TODO: more -->
</configure>
from Interface import Interface, Attribute
class idatastream(Interface):
"""data stream, is the result of a transform"""
def setData(value):
"""set the main data produced by a transform, i.e. usually a string"""
def getData():
"""provide access to the transformed data object, i.e. usually a string.
This data may reference subobjects.
"""
def setSubObjects(objects):
"""set a dict-like object containing subobjects.
keys should be object's identifier (e.g. usually a filename) and
values object's content.
"""
def getSubObjects():
"""return a dict-like object with any optional subobjects associated
with the data"""
def getMetadata():
"""return a dict-like object with any optional metadata from
the transform
You can modify the returned dictionary to add/change metadata
"""
def isCacheable():
"""Return a bool which indicates whether the result should be cached
Default is true
"""
def setCacheable(value):
"""Set cacheable flag to yes or no
"""
class itransform(Interface):
"""A transformation plugin -- transform data somehow
must be threadsafe and stateless"""
# inputs = Attribute("""list of imimetypes (or registered rfc-2046
# names) this transform accepts as inputs.""")
# output = Attribute("""output imimetype as instance or rfc-2046
# name""")
# output_encoding = Attribute("""output encoding of this transform.
# If not specified, the transform should output the same encoding as received data
# """)
def name():
"""return the name of the transform instance"""
def convert(data, idata, filename=None, **kwargs):
"""convert the data, store the result in idata and return that
optional argument filename may give the original file name of received data
additional arguments given to engine's convert, convertTo or __call__ are
passed back to the transform
The object on which the transformation was invoked is available as context
(default: None)
"""
class ichain(itransform):
def registerTransform(transform, condition=None):
"""Append a transform to the chain"""
class iengine(Interface):
def registerTransform(transform):
"""register a transform
transform must implement itransform
"""
def unregisterTransform(name):
""" unregister a transform
name is the name of a registered transform
"""
def convertTo(mimetype, orig, data=None, object=None, context=None, **kwargs):
"""Convert orig to a given mimetype
* orig is an encoded string
* data an optional idatastream object. If None a new datastream will be
created and returned
* optional object argument is the object to which the data is bound.
If present, that object will be used by the engine to bind cached data.
* optional context argument is the object on which the transformation
was called.
* additional arguments (kwargs) will be passed to the transformations.
return an object implementing idatastream or None if no path has been
found.
"""
def convert(name, orig, data=None, context=None, **kwargs):
"""run a transform of a given name on data
* name is the name of a registered transform
see convertTo docstring for more info
"""
def __call__(name, orig, data=None, context=None, **kwargs):
"""run a transform by its name, returning the raw data product
* name is the name of a registered transform.
return an encoded string.
see convert docstring for more info on additional arguments.
"""
""" package containing some utilities which aim to facilitate transformation writing
"""
import os
import sys
import tempfile
import re
import shutil
from os.path import join, basename
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext, getShortPathName
class commandtransform:
"""abstract class for external command based transform
"""
__implements__ = itransform
def __init__(self, name=None, binary=None, **kwargs):
if name is not None:
self.__name__ = name
if binary is not None:
self.binary = bin_search(binary)
self.binary = getShortPathName(self.binary)
def name(self):
return self.__name__
def initialize_tmpdir(self, data, **kwargs):
"""create a temporary directory, copy input in a file there
return the path of the tmp dir and of the input file
"""
tmpdir = tempfile.mkdtemp()  # creates the directory atomically
filename = kwargs.get("filename", '')
fullname = join(tmpdir, basename(filename))
filedest = open(fullname, "wb")
filedest.write(data)
filedest.close()  # close explicitly so the external command sees all the data
return tmpdir, fullname
def subObjects(self, tmpdir):
imgs = []
for f in os.listdir(tmpdir):
result = re.match("^.+\.(?P<ext>.+)$", f)
if result is not None:
ext = result.group('ext')
if ext in ('png', 'jpg', 'gif'):
imgs.append(f)
path = join(tmpdir, '')
return path, imgs
def fixImages(self, path, images, objects):
for image in images:
objects[image] = open(join(path, image), 'rb').read()
def cleanDir(self, tmpdir):
shutil.rmtree(tmpdir)
class popentransform:
"""abstract class for external command based transform
Command must read from stdin and write to stdout
"""
__implements__ = itransform
binaryName = ""
binaryArgs = ""
useStdin = True
def __init__(self, name=None, binary=None, binaryArgs=None, useStdin=None,
**kwargs):
if name is not None:
self.__name__ = name
if binary is not None:
self.binary = bin_search(binary)
else:
self.binary = bin_search(self.binaryName)
if binaryArgs is not None:
self.binaryArgs = binaryArgs
if useStdin is not None:
self.useStdin = useStdin
def name(self):
return self.__name__
def getData(self, couterr):
return couterr.read()
def convert(self, data, cache, **kwargs):
command = "%s %s" % (self.binary, self.binaryArgs)
if not self.useStdin:
tmpfile, tmpname = tempfile.mkstemp(text=False) # create tmp
os.write(tmpfile, data) # write data to tmp using a file descriptor
os.close(tmpfile) # close it so the other process can read it
command = command % { 'infile' : tmpname } # apply tmp name to command
cin, couterr = os.popen4(command, 'b')
if self.useStdin:
cin.write(str(data))
status = cin.close()
out = self.getData(couterr)
couterr.close()
if not self.useStdin:
# remove tmp file
os.unlink(tmpname)
cache.setData(out)
return cache
from Products.PortalTransforms.interfaces import itransform
from StringIO import StringIO
import PIL.Image
class PILTransforms:
__implements__ = itransform
__name__ = "piltransforms"
def __init__(self, name=None):
if name is not None:
self.__name__ = name
def name(self):
return self.__name__
def convert(self, orig, data, **kwargs):
imgio = StringIO()
orig = StringIO(orig)
newwidth = kwargs.get('width',None)
newheight = kwargs.get('height',None)
pil_img = PIL.Image.open(orig)
if(self.format in ['jpeg','ppm']):
pil_img.draft("RGB", pil_img.size)
pil_img = pil_img.convert("RGB")
if(newwidth or newheight):
pil_img.thumbnail((newwidth,newheight),PIL.Image.ANTIALIAS)
pil_img.save(imgio,self.format)
data.setData(imgio.getvalue())
return data
def register():
return PILTransforms()
from Products.PortalTransforms.interfaces import itransform
import re
class retransform:
"""abstract class for regex transforms (re.sub wrapper)"""
__implements__ = itransform
inputs = ('text/',)
def __init__(self, name, *args):
self.__name__ = name
self.regexes = []
for pat, repl in args:
self.addRegex(pat, repl)
def name(self):
return self.__name__
def addRegex(self, pat, repl):
r = re.compile(pat)
self.regexes.append((r, repl))
def convert(self, orig, data, **kwargs):
for r, repl in self.regexes:
orig = r.sub(repl, orig)
data.setData(orig)
return data
import re
import os
import sys
from sgmllib import SGMLParser
try:
import win32api
WIN32 = True
except ImportError:
WIN32 = False
class MissingBinary(Exception): pass
envPath = os.environ.get('PATH', '')
bin_search_path = [path for path in envPath.split(os.pathsep)
if os.path.isdir(path)]
cygwin = 'c:/cygwin'
# cygwin support
if sys.platform == 'win32' and os.path.isdir(cygwin):
for p in ['/bin', '/usr/bin', '/usr/local/bin' ]:
p = cygwin + p  # os.path.join would discard the cygwin prefix because p is rooted
if os.path.isdir(p):
bin_search_path.append(p)
if sys.platform == 'win32':
extensions = ('.exe', '.com', '.bat', )
else:
extensions = ()
def bin_search(binary):
    """search the bin_search_path for a given binary returning its fullname or
    raises MissingBinary"""
    mode = os.R_OK | os.X_OK
    for path in bin_search_path:
        for ext in ('', ) + extensions:
            pathbin = os.path.join(path, binary) + ext
            if os.access(pathbin, mode):
                # return the first match; a plain break would only leave the
                # inner loop and let later path entries override the result
                return pathbin
    raise MissingBinary('Unable to find binary "%s" in %s' %
                        (binary, os.pathsep.join(bin_search_path)))
def getShortPathName(binary):
if WIN32:
try:
binary = win32api.GetShortPathName(binary)
except win32api.error:
# 'log' is neither defined nor imported in this module; report on stderr instead
sys.stderr.write("Failed to GetShortPathName for '%s'\n" % binary)
return binary
def sansext(path):
return os.path.splitext(os.path.basename(path))[0]
##########################################################################
# The code below is taken from CMFDefault.utils to remove
# dependencies for Python-only installations
##########################################################################
def bodyfinder(text):
""" Return body or unchanged text if no body tags found.
Always use html_headcheck() first.
"""
lowertext = text.lower()
bodystart = lowertext.find('<body')
if bodystart == -1:
return text
bodystart = lowertext.find('>', bodystart) + 1
if bodystart == 0:
return text
bodyend = lowertext.rfind('</body>', bodystart)
if bodyend == -1:
return text
return text[bodystart:bodyend]
#
# HTML cleaning code
#
# These are the HTML tags that we will leave intact
VALID_TAGS = { 'a' : 1
, 'b' : 1
, 'base' : 0
, 'blockquote' : 1
, 'body' : 1
, 'br' : 0
, 'caption' : 1
, 'cite' : 1
, 'code' : 1
, 'div' : 1
, 'dl' : 1
, 'dt' : 1
, 'dd' : 1
, 'em' : 1
, 'h1' : 1
, 'h2' : 1
, 'h3' : 1
, 'h4' : 1
, 'h5' : 1
, 'h6' : 1
, 'head' : 1
, 'hr' : 0
, 'html' : 1
, 'i' : 1
, 'img' : 0
, 'kbd' : 1
, 'li' : 1
, 'meta' : 0
, 'ol' : 1
, 'p' : 1
, 'pre' : 1
, 'span' : 1
, 'strong' : 1
, 'table' : 1
, 'tbody' : 1
, 'td' : 1
, 'th' : 1
, 'title' : 1
, 'tr' : 1
, 'tt' : 1
, 'u' : 1
, 'ul' : 1
}
NASTY_TAGS = { 'script' : 1
, 'object' : 1
, 'embed' : 1
, 'applet' : 1
}
class IllegalHTML( ValueError ):
pass
class StrippingParser( SGMLParser ):
""" Pass only allowed tags; raise exception for known-bad. """
from htmlentitydefs import entitydefs # replace entitydefs from sgmllib
def __init__( self ):
SGMLParser.__init__( self )
self.result = ""
def handle_data( self, data ):
if data:
self.result = self.result + data
def handle_charref( self, name ):
self.result = "%s&#%s;" % ( self.result, name )
def handle_entityref(self, name):
if self.entitydefs.has_key(name):
x = ';'
else:
# this breaks non-standard entities that end with ';'
x = ''
self.result = "%s&%s%s" % (self.result, name, x)
def unknown_starttag(self, tag, attrs):
""" Delete all tags except for legal ones.
"""
if VALID_TAGS.has_key(tag):
self.result = self.result + '<' + tag
for k, v in attrs:
if k.lower().startswith( 'on' ):
raise IllegalHTML, 'Javascript event "%s" not allowed.' % k
if v.lower().startswith( 'javascript:' ):
raise IllegalHTML, 'Javascript URI "%s" not allowed.' % v
self.result = '%s %s="%s"' % (self.result, k, v)
endTag = '</%s>' % tag
if VALID_TAGS.get(tag):
self.result = self.result + '>'
else:
self.result = self.result + ' />'
elif NASTY_TAGS.get( tag ):
raise IllegalHTML, 'Dynamic tag "%s" not allowed.' % tag
else:
pass # omit tag
def unknown_endtag(self, tag):
if VALID_TAGS.get( tag ):
self.result = "%s</%s>" % (self.result, tag)
remTag = '</%s>' % tag
def scrubHTML( html ):
""" Strip illegal HTML tags from string text. """
parser = StrippingParser()
parser.feed( html )
parser.close()
return parser.result
##############################################################################
#
# Copyright (c) 2001 Zope Corporation and Contributors. All Rights Reserved.
#
# This software is subject to the provisions of the Zope Public License,
# Version 2.0 (ZPL). A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE
#
##############################################################################
"""Wrapper to integrate reStructuredText into Zope
This implementation requires docutils 0.3.3+ from http://docutils.sf.net/
Based on the new implementation of Zope 2.7.1 altered for PortalTransforms
"""
try:
import docutils
except ImportError:
raise ImportError, 'Please install docutils 0.3.3+ from http://docutils.sourceforge.net/#download.'
version = docutils.__version__.split('.')
if version < ['0', '3', '3']:
raise ImportError, """Old version of docutils found:
Got: %(version)s, required: 0.3.3+
Please remove docutils from %(path)s and replace it with a new version. You
can download docutils at http://docutils.sourceforge.net/#download.
""" % {'version' : docutils.__version__, 'path' : docutils.__path__[0] }
import sys, os, locale
##from App.config import getConfiguration
from docutils.core import publish_parts
# get encoding
##default_enc = sys.getdefaultencoding()
##default_output_encoding = getConfiguration().rest_output_encoding or default_enc
##default_input_encoding = getConfiguration().rest_input_encoding or default_enc
default_enc = 'utf-8'
default_output_encoding = default_enc
default_input_encoding = default_enc
# starting level for <H> elements (default behaviour inside Zope is <H3>)
default_level = 3
##initial_header_level = getConfiguration().rest_header_level or default_level
initial_header_level = default_level
# default language
##default_lang = getConfiguration().locale or locale.getdefaultlocale()[0]
default_lang = locale.getdefaultlocale()[0]
if default_lang and '_' in default_lang:
default_lang = default_lang[:default_lang.index('_')]
class Warnings:
def __init__(self):
self.messages = []
def write(self, message):
self.messages.append(message)
def render(src,
writer='html4css1',
report_level=1,
stylesheet='default.css',
input_encoding=default_input_encoding,
output_encoding=default_output_encoding,
language_code=default_lang,
initial_header_level = initial_header_level,
settings = {}):
"""get the rendered parts of the document and the warning object
"""
# Docutils settings:
settings = settings.copy()
settings['input_encoding'] = input_encoding
settings['output_encoding'] = output_encoding
settings['stylesheet'] = stylesheet
settings['language_code'] = language_code
# starting level for <H> elements:
settings['initial_header_level'] = initial_header_level + 1
# set the reporting level to something sane:
settings['report_level'] = report_level
# don't break if we get errors:
settings['halt_level'] = 6
# remember warnings:
settings['warning_stream'] = warning_stream = Warnings()
parts = publish_parts(source=src, writer_name=writer,
settings_overrides=settings,
config_section='zope application')
return parts, warning_stream
def HTML(src,
writer='html4css1',
report_level=1,
stylesheet='default.css',
input_encoding=default_input_encoding,
output_encoding=default_output_encoding,
language_code=default_lang,
initial_header_level = initial_header_level,
warnings = None,
settings = {}):
""" render HTML from a reStructuredText string
- 'src' -- string containing a valid reST document
- 'writer' -- docutils writer
- 'report_level' - verbosity of reST parser
- 'stylesheet' - Stylesheet to be used
- 'input_encoding' - encoding of the reST input string
- 'output_encoding' - encoding of the rendered HTML output
- 'language_code' - docutils language
- 'initial_header_level' - level of the first header tag
- 'warnings' - unused: the warnings are only joined into a local string; call render() to get the warning stream
- 'settings' - dict of settings to pass in to Docutils, with priority
"""
parts, warning_stream = render(src,
writer = writer,
report_level = report_level,
stylesheet = stylesheet,
input_encoding = input_encoding,
output_encoding = output_encoding,
language_code=language_code,
initial_header_level = initial_header_level,
settings = settings)
header = '<h%(level)s class="title">%(title)s</h%(level)s>\n' % {
'level': initial_header_level,
'title': parts['title'],
}
body = '%(docinfo)s%(body)s' % {
'docinfo': parts['docinfo'],
'body': parts['body'],
}
if parts['title']:
output = header + body
else:
output = body
warnings = ''.join(warning_stream.messages)
return output.encode(output_encoding)
__all__ = ("HTML", 'render')
##############################################################################
#
# ZopeTestCase
#
# COPY THIS FILE TO YOUR 'tests' DIRECTORY.
#
# This version of framework.py will use the SOFTWARE_HOME
# environment variable to locate Zope and the Testing package.
#
# If the tests are run in an INSTANCE_HOME installation of Zope,
# Products.__path__ and sys.path will be adjusted to include the
# instance's Products and lib/python directories respectively.
#
# If you explicitly set INSTANCE_HOME prior to running the tests,
# auto-detection is disabled and the specified path will be used
# instead.
#
# If the 'tests' directory contains a custom_zodb.py file, INSTANCE_HOME
# will be adjusted to use it.
#
# If you set the ZEO_INSTANCE_HOME environment variable a ZEO setup
# is assumed, and you can attach to a running ZEO server (via the
# instance's custom_zodb.py).
#
##############################################################################
#
# The following code should be at the top of every test module:
#
# import os, sys
# if __name__ == '__main__':
# execfile(os.path.join(sys.path[0], 'framework.py'))
#
# ...and the following at the bottom:
#
# if __name__ == '__main__':
# framework()
#
##############################################################################
__version__ = '0.2.3'
# Save start state
#
__SOFTWARE_HOME = os.environ.get('SOFTWARE_HOME', '')
__INSTANCE_HOME = os.environ.get('INSTANCE_HOME', '')
if __SOFTWARE_HOME.endswith(os.sep):
__SOFTWARE_HOME = os.path.dirname(__SOFTWARE_HOME)
if __INSTANCE_HOME.endswith(os.sep):
__INSTANCE_HOME = os.path.dirname(__INSTANCE_HOME)
# Find and import the Testing package
#
if not sys.modules.has_key('Testing'):
p0 = sys.path[0]
if p0 and __name__ == '__main__':
os.chdir(p0)
p0 = ''
s = __SOFTWARE_HOME
p = d = s and s or os.getcwd()
while d:
if os.path.isdir(os.path.join(p, 'Testing')):
zope_home = os.path.dirname(os.path.dirname(p))
sys.path[:1] = [p0, p, zope_home]
break
p, d = s and ('','') or os.path.split(p)
else:
print 'Unable to locate Testing package.',
print 'You might need to set SOFTWARE_HOME.'
sys.exit(1)
import Testing, unittest
execfile(os.path.join(os.path.dirname(Testing.__file__), 'common.py'))
# Include ZopeTestCase support
#
if 1: # Create a new scope
p = os.path.join(os.path.dirname(Testing.__file__), 'ZopeTestCase')
if not os.path.isdir(p):
print 'Unable to locate ZopeTestCase package.',
print 'You might need to install ZopeTestCase.'
sys.exit(1)
ztc_common = 'ztc_common.py'
ztc_common_global = os.path.join(p, ztc_common)
f = 0
if os.path.exists(ztc_common_global):
execfile(ztc_common_global)
f = 1
if os.path.exists(ztc_common):
execfile(ztc_common)
f = 1
if not f:
print 'Unable to locate %s.' % ztc_common
sys.exit(1)
# Debug
#
print 'SOFTWARE_HOME: %s' % os.environ.get('SOFTWARE_HOME', 'Not set')
print 'INSTANCE_HOME: %s' % os.environ.get('INSTANCE_HOME', 'Not set')
sys.stdout.flush()
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91"><channel><title>Logilab.org news</title><language>en</language><item><title>xmltools 1.3.7</title><descr>bugfix in namespace handling</descr></item><item><title>Python-logic</title><descr>Set up of the Python-Logic special interest group</descr></item><item><title>PyReverse 0.2.3</title><descr>New features and bug fixes</descr></item><item><title>xmltools 1.3.6</title><descr>Uses the new APIs in pyxml-0.7 and 4Suite-0.12.0</descr></item><item><title>hmm-0.2</title><descr>New learning algorithms available</descr></item><item><title>Version 1.2a1 is out</title><descr>Overall refactoring of the engine. Backward incompatible changes
in the syntax of recipes and in modules, in order to ease product development.</descr></item><item><title>XMLdiff v0.5.3 (bug fixes)</title><descr>Version 0.5.3 fixes packaging bugs.</descr></item><item><title>hmm-0.1</title><descr>hmm is a module for Hidden Markov Model manipulation.</descr></item><item><title>PyReverse 0.1 (new product)</title><descr>
Beta release for this set of tools for reverse engineering python code
</descr></item><item><title>PyPaSax 0.3 (bug fixes)</title><descr>A few changes in the DTD, improved PyXML compatibility</descr></item><item><title>XMLdiff v0.5.2 (bug fixes)</title><descr>Version 0.5.2 fixes several bugs.</descr></item><item><title>Version 1.1 is out</title><descr>bugfixes over beta 3.</descr></item><item><title>Version 1.1b3 is out</title><descr>Great speed improvement for Horn. All-in-one windows installer.</descr></item><item><title>xmltools-1.3.5</title><descr>Version 1.3.5 code cleanup.</descr></item><item><title>xmltools-1.3.4</title><descr>Version 1.3.4 fixes a sever encoding bug that could cause crashes on windows machines.</descr></item><item><title>Version 1.1b1 is out</title><descr>Version 1.1b1 drops support for Python 1.5.2 in favor of Python 2.1, and features a new version of Horn, with localization support</descr></item><item><title>XMLdiff v0.5 (algorithm change, bug fixes)</title><descr>Version 0.5. The new algorithm makes it now usable either on big
documents and really faster in any cases. Fixes Unicode problem.</descr></item><item><title>XMLtools v1.3.1 (bugfixes)</title><descr>Version 1.3.1. This release fixes some minor glitches that had slipped in 1.3.</descr></item><item><title>XMLdiff v0.2 (performance improvement)</title><descr>Version 0.2. Huge performance improvement, and output cleanup.</descr></item><item><title>XMLdiff v0.1.1 (beta release)</title><descr>Version 0.1.1. Fully functionnal. Beta release.</descr></item><item><title>XPathVis v1.0beta (beta release)</title><descr>Version 1.0beta. Works nicely.</descr></item><item><title>XMLtools v1.3 (new features)</title><descr>Version 1.3. This release is compatible with Python 2.x and Unicode. It is not guaranteed to work with Python 1.5.2.</descr></item><item><title>Narval on developerWorks</title><descr>An Introduction to Narval was published on developerWorks.</descr><link>http://www-106.ibm.com/developerworks/library/l-ai/</link></item><item><title>Version 1.0.1 is out</title><descr>Version 1.0.1 is a bugfix release.</descr></item><item><title>Narval reviewed on AI.About.com</title><descr>AI.About.com published a review of Narval.</descr><link>http://ai.about.com/compute/ai/library/weekly/aa060801a.htm</link></item><item><title>Narval at BotShow 2001</title><descr>Narval was presented at the first BotShow event. The slides will soon be available online.</descr><link>http://www.ptolemee.com/botshow/text/text_fr/edito/edito_set.html</link></item><item><title>Version 1.0 is out</title><descr>Version 1.0. Celebration time, come on!</descr></item><item><title>Network-boot-HOWTO v0.2.1</title><descr>Version 0.2.1 is out.</descr></item><item><title>GuessLang v0.1.0 (beta release)</title><descr>Version 0.1.0 is out.</descr></item><item><title>Network-boot-HOWTO v0.1.1</title><descr>Version 0.1.1 is out.</descr></item><item><title>PyPaSax v0.1</title><descr>Version 0.1 is out.</descr></item><item><title>RC2 is out</title><descr>Release Candidate 2 is out. 
French documentation will be updated within a few days. We also released several applications (or maybe extension sets?) that are in alpha/beta stage. Give them a try!</descr></item><item><title>VCalSax v0.1 (beta)</title><descr>Version 0.1 is out. Still beta, but fully functional.</descr></item><item><title>Talk at LinuxExpo in English</title><descr>A translation of the talk we gave at Linux Expo 2001 is available on-line.</descr><link>http://www.logilab.com/press/linux-expo2001/</link></item><item><title>RC1 is out</title><descr>Release Candidate 1 is out. Documentation will be updated within a few days. Please help us test this one so that we can release 1.0 quicker.</descr></item><item><title>XMLtools v1.2 (stable release)</title><descr>Version 1.2 is released. Bugfixes, mainly..</descr></item><item><title>Application section on web site</title><descr>We just added a new applications section on Logilab.org web site.</descr><link>http://www.logilab.org/narval/app.html</link></item><item><title>WMgMon v0.4.0</title><descr>version 0.4.0 is out. Bugfixes and new monitor functions. </descr></item><item><title>XmlTools v1.1</title><descr>version 1.1 is out. New features in XmlTree.</descr></item><item><title>Beta5 is out</title><descr>Beta 5 is out. Lots of bugfixes in Narval and Horn, client server communication between the kernel and the graphical interface using SOAP. Windows specific bugfixes.</descr></item><item><title>XmlTools v1.0</title><descr>Initial release.</descr></item><item><title>PyGantt v0.6.0</title><descr>Version 0.6.0 released. New features added.</descr></item><item><title>Beta4 is out</title><descr>Beta 4 is out. Improved Windows compatibility. New features and bugfixes in both Narval and Horn.</descr></item><item><title>Article on Narval in Linux Gazette</title><descr>We published an article in the #59 issue of the Linux Gazette. 
It describes Narval and its use to set up Gazo, the assistant-coordinator for the translation of the Linux Gazette.</descr><link>http://www.linuxgazette.com/issue59/chauvat.html</link></item><item><title>Beta3 is out</title><descr>Beta 3 is out. No more memory leaks (almost). Time conditions work correctly. A step can be an XSL transform. Changes in the Narval DTD.</descr></item><item><title>Logilab invited at Linux Expo</title><descr>We got invited to give a talk at Linux Expo in Paris, France. The talk will be geared toward business uses of Narval. The title will be Using XML and Intelligent Personnal Assistants to enhance groupware and workflow enterprise applications. Come and meet with us!</descr><link>http://www.linuxexpoparis.com/EN/conferences</link></item><item><title>Beta2 is out</title><descr>Beta 2 is out. Installation is much easier. Tutorial. GUI improvements. Bugfixes.</descr></item><item><title>Beta1 is out</title><descr>Beta 1 is out. Many bug fixes.</descr></item><item><title>Beta0 is out</title><descr>Beta 0 is out. Logilab.org is one-line.</descr><link>http://www.logilab.org</link></item></channel></rss>
This is a test of the *reST* transform
o one
o two
o three
Heading 1
=========
Some text.
Heading 2
---------
Some text, bla ble bli blo blu. Yes, i know this is Stupid_.
.. _Stupid: http://www.example.com
=====
Title
=====
--------
Subtitle
--------
This is a test document to make sure subtitle gets the right heading.
Now the real heading
====================
The brown fox jumped over the lazy dog.
With a subheading
------------------
Some text, bla ble bli blo blu. Yes, i know this is Stupid_.
.. _Stupid: http://www.example.com
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:transform xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:strip-space elements='*'/>
<xsl:output method='xml'/>
<!-- Narval prototype ====================================================== -->
<al:prototype xmlns:al="http://www.logilab.org/namespaces/Narval/1.2">
<al:description lang="fr">Transforme du RSS en du HTML.</al:description>
<al:description lang="en">Turns RSS into HTML.</al:description>
<al:input id="input"><al:match>rss</al:match></al:input>
<al:output id="output" list="yes"><al:match>html-body</al:match></al:output>
</al:prototype>
<!-- root ================================================================== -->
<xsl:template match='rss/rss/channel'>
<html-body>
<h2>
<xsl:value-of select='title'/>
</h2>
<p>
<xsl:element name='a'>
<xsl:attribute name='href'><xsl:value-of select='link'/></xsl:attribute>
<xsl:value-of select='title'/>
</xsl:element>
<em><xsl:value-of select='description'/></em>
</p>
<table>
<xsl:apply-templates select='item'/>
</table>
</html-body>
</xsl:template>
<xsl:template match='item'>
<tr>
<td>
<xsl:element name='a'>
<xsl:attribute name='href'><xsl:value-of select='link'/></xsl:attribute>
<xsl:value-of select='title'/>
</xsl:element>
<xsl:apply-templates mode='multi' select='description'/>
</td>
</tr>
</xsl:template>
</xsl:transform>
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.2.8: http://docutils.sourceforge.net/" />
<title>Copying Docutils</title>
<meta name="author" content="David Goodger" />
<meta name="date" content="2002-10-03" />
<link rel="stylesheet" href="tools/stylesheets/default.css" type="text/css" />
</head>
<body>
<div class="document" id="copying-docutils">
<h1 class="title">Copying Docutils</h1>
<table class="docinfo" frame="void" rules="none">
<col class="docinfo-name" />
<col class="docinfo-content" />
<tbody valign="top">
<tr><th class="docinfo-name">Author:</th>
<td>David Goodger</td></tr>
<tr><th class="docinfo-name">Contact:</th>
<td><a class="first last reference" href="mailto:goodger&#64;users.sourceforge.net">goodger&#64;users.sourceforge.net</a></td></tr>
<tr><th class="docinfo-name">Date:</th>
<td>2002-10-03</td></tr>
<tr class="field"><th class="docinfo-name">Web site:</th><td class="field-body"><a class="reference" href="http://docutils.sourceforge.net/">http://docutils.sourceforge.net/</a></td>
</tr>
</tbody>
</table>
<p>Most of the files included in this project are in the public domain,
and therefore have no license requirement and no restrictions on
copying or usage. The exceptions are:</p>
<ul class="simple">
<li>docutils/optik.py, copyright Gregory P. Ward, released under a
BSD-style license (which can be found in the module's source code).</li>
<li>docutils/roman.py, copyright by Mark Pilgrim, released under the
<a class="reference" href="http://www.python.org/2.1.1/license.html">Python 2.1.1 license</a>.</li>
<li>test/difflib.py, copyright by the Python Software Foundation,
released under the <a class="reference" href="http://www.python.org/2.2/license.html">Python 2.2 license</a>. This file is included for
compatibility with Python versions less than 2.2; if you have Python
2.2 or higher, difflib.py is not needed and may be removed. (It's
only used to report test failures anyhow; it isn't installed
anywhere. The included file is a pre-generator version of the
difflib.py module included in Python 2.2.)</li>
</ul>
<p>(Disclaimer: I am not a lawyer.) Both the BSD license and the Python
license are <a class="reference" href="http://opensource.org/licenses/">OSI-approved</a> and <a class="reference" href="http://www.gnu.org/philosophy/license-list.html">GPL-compatible</a>. Although complicated
by multiple owners and lots of legalese, the Python license basically
lets you copy, use, modify, and redistribute files as long as you keep
the copyright attribution intact, note any changes you make, and don't
use the owner's name in vain. The BSD license is similar.</p>
</div>
<hr class="footer"/>
<div class="footer">
Generated on: 2003-04-19 15:32 UTC.
Generated by <a class="reference" href="http://docutils.sourceforge.net/">Docutils</a> from <a class="reference" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> source.
</div>
</body>
</html>
""" nice docstring """
class A : pass
# comment
def inc(i):
return i+1
def greater(a, b):
"""foo <html />"""
return a > b
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Test page for save html rendering</title>
<meta name="date" content="2005-07-22" />
</head>
<body>
<h1>Test page</h1>
<table>
<tr>
<th>Test1</th>
<td>test2</td>
</tr>
</table>
<p>This is a text used as a blind text.</p>
<ul>
<li>A sample list item1</li>
<li>A sample list item2</li>
</ul>
<p>This is again a blind text with a<br>line break.</p>
<div>
Can we <q>quote</q> or write something we <del>didn't</del> mean to write? Or how is <ins>this</ins> instead?
</div>
<hr>
<div>
<a href="http://www.plone.org"><img src="http://www.plone.org/logo.jpg"/></a> is just great.
</div>
</body>
</html>
<A name=1></a>Chapter 44<br>
Writing Basic Unit Tests<br>
Difficulty<br>
Newcomer<br>
Skills<br>
• All you need to know is some Python.<br>
Problem/Task<br>
As you know by now, Zope 3 gains its incredible stability from testing any code in great detail. The<br>currently most common method is to write unit tests. This chapter introduces unit tests – which<br>are Zope 3 independent – and introduces some of the subtleties.<br>
Solution<br>
44.1<br>
Implementing the Sample Class<br>
Before we can write tests, we have to write some code that we can test. Here, we will implement<br>a simple class called Sample with a public attribute title and description that is accessed<br>via getDescription() and mutated using setDescription(). Further, the description must be<br>either a regular or unicode string.<br>
Since this code will not depend on Zope, open a file named test_sample.py anywhere and add<br>
the following class:<br>
1 class Sample(object):<br>
2     &quot;&quot;&quot;A trivial Sample object.&quot;&quot;&quot;<br>
3<br>
4     title = None<br>
5<br>
6     def __init__(self):<br>
7         &quot;&quot;&quot;Initialize object.&quot;&quot;&quot;<br>
8         self._description = ''<br>
9<br>
1<br>
<hr>
<A name=2></a>2<br>
CHAPTER 44. WRITING BASIC UNIT TESTS<br>
10     def setDescription(self, value):<br>
11         &quot;&quot;&quot;Change the value of the description.&quot;&quot;&quot;<br>
12         assert isinstance(value, (str, unicode))<br>
13         self._description = value<br>
14<br>
15     def getDescription(self):<br>
16         &quot;&quot;&quot;Return the value of the description.&quot;&quot;&quot;<br>
17         return self._description<br>
Line 4: The title is just publicly declared and a value of None is given. Therefore this is just<br>a regular attribute.<br>
Line 8: The actual description string will be stored in _description.<br>
Line 12: Make sure that the description is only a regular or unicode string, like it was stated in<br>the requirements.<br>
If you wish you can now manually test the class with the interactive Python shell. Just start<br>
Python by entering python in your shell prompt. Note that you should be in the directory in<br>which test_sample.py is located when starting Python (an alternative is of course to specify the<br>directory in your PYTHONPATH).<br>
&gt;&gt;&gt; from test_sample import Sample<br>
&gt;&gt;&gt; sample = Sample()<br>
&gt;&gt;&gt; print sample.title<br>
None<br>
&gt;&gt;&gt; sample.title = 'Title'<br>
&gt;&gt;&gt; print sample.title<br>
Title<br>
&gt;&gt;&gt; print sample.getDescription()<br>
<br>
&gt;&gt;&gt; sample.setDescription('Hello World')<br>
&gt;&gt;&gt; print sample.getDescription()<br>
Hello World<br>
&gt;&gt;&gt; sample.setDescription(None)<br>
Traceback (most recent call last):<br>
  File &quot;&lt;stdin&gt;&quot;, line 1, in ?<br>
  File &quot;test_sample.py&quot;, line 31, in setDescription<br>
    assert isinstance(value, (str, unicode))<br>
AssertionError<br>
As you can see in the last test, non-string object types are not allowed as descriptions and an<br>
AssertionError is raised.<br>
44.2<br>
Writing the Unit Tests<br>
The goal of writing the unit tests is to convert this informal, manual, and interactive testing session<br>into a formal test class. Python already provides a module called unittest for this purpose, which<br>is a port of the Java-based unit testing product, JUnit, by Kent Beck and Erich Gamma. There are<br>three levels to the testing framework (this list deviates a bit from the original definitions as found<br>in the Python library documentation [1]).<br>
[1] http://www.python.org/doc/current/lib/module-unittest.html<br>
<hr>
<A name=3></a>44.2. WRITING THE UNIT TESTS<br>
3<br>
The smallest unit is obviously the “test”, which is a single method in a TestCase class that<br>
tests the behavior of a small piece of code or a particular aspect of an implementation. The “test<br>case” is then a collection of tests that share the same setup/inputs. On top of all of this sits the “test<br>suite” which is a collection of test cases and/or other test suites. Test suites combine tests that<br>should be executed together. With the correct setup (as shown in the example below), you can<br>then execute test suites. For large projects like Zope 3, it is useful to know that there is also the<br>concept of a test runner, which manages the test run of all or a set of tests. The runner provides<br>useful feedback to the application, so that various user interfaces can be developed on top of it.<br>
But enough about the theory. In the following example, which you can simply put into the same<br>
file as your code above, you will see a test in common Zope 3 style.<br>
1 import unittest<br>
2<br>
3 class SampleTest(unittest.TestCase):<br>
4     &quot;&quot;&quot;Test the Sample class&quot;&quot;&quot;<br>
5<br>
6     def test_title(self):<br>
7         sample = Sample()<br>
8         self.assertEqual(sample.title, None)<br>
9         sample.title = 'Sample Title'<br>
10         self.assertEqual(sample.title, 'Sample Title')<br>
11<br>
12     def test_getDescription(self):<br>
13         sample = Sample()<br>
14         self.assertEqual(sample.getDescription(), '')<br>
15         sample._description = &quot;Description&quot;<br>
16         self.assertEqual(sample.getDescription(), 'Description')<br>
17<br>
18     def test_setDescription(self):<br>
19         sample = Sample()<br>
20         self.assertEqual(sample._description, '')<br>
21         sample.setDescription('Description')<br>
22         self.assertEqual(sample._description, 'Description')<br>
23         sample.setDescription(u'Description2')<br>
24         self.assertEqual(sample._description, u'Description2')<br>
25         self.assertRaises(AssertionError, sample.setDescription, None)<br>
26<br>
27<br>
28 def test_suite():<br>
29     return unittest.TestSuite((<br>
30         unittest.makeSuite(SampleTest),<br>
31         ))<br>
32<br>
33 if __name__ == '__main__':<br>
34     unittest.main(defaultTest='test_suite')<br>
Line 3–4: We usually develop test classes which must inherit from TestCase. While often not<br>done, it is a good idea to give the class a meaningful docstring that describes the purpose of the<br>tests it includes.<br>
Line 6, 12 &amp; 18: When a test case is run, a method called runTest() is executed by default. While it<br>is possible to override this method to run tests differently, the default option will look for any<br>method whose name starts with test and execute it as a single test. This way we can create<br>a “test method” for each aspect, method, function or property of the code to be tested. This<br>default is very sensible and is used everywhere in Zope 3.<br>
<hr>
<A name=4></a>4<br>
CHAPTER 44. WRITING BASIC UNIT TESTS<br>
Note that there is no docstring for test methods. This is intentional. If a docstring is specified,<br>it is used instead of the method name to identify the test. When specifying a docstring, we have<br>noticed that it is very difficult to identify the test later; therefore the method name is a much<br>better choice.<br>
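The naming rule and the docstring behavior described above can be checked directly. The following is a minimal sketch for a current Python 3 interpreter and is not part of the original chapter; the DemoTest class and its method names are hypothetical.<br>

```python
import unittest


class DemoTest(unittest.TestCase):
    """Hypothetical test case used only to illustrate discovery."""

    def test_without_docstring(self):
        self.assertEqual(1 + 1, 2)

    def test_with_docstring(self):
        """Adding works."""
        self.assertEqual(1 + 1, 2)

    def helper(self):
        # Not collected: the name does not start with "test".
        return None


# The default loader collects only method names starting with "test".
names = unittest.TestLoader().getTestCaseNames(DemoTest)

# shortDescription() is the first docstring line, or None without a docstring.
plain = DemoTest('test_without_docstring').shortDescription()
described = DemoTest('test_with_docstring').shortDescription()
```

Here names contains the two test_ methods but not helper, plain is None, and described holds the docstring text, which is what a runner would show in place of the method name.<br>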
Line 8, 10, 14, . . . : The TestCase class implements a handful of methods that aid you with the<br>testing. Here are some of the most frequently used ones. For a complete list see the standard<br>Python documentation referenced above.<br>
• assertEqual(first,second[,msg])<br>
Checks whether the first and second value are equal. If they are not, the test fails and<br>msg (or None) is reported.<br>
• assertNotEqual(first,second[,msg])<br>
This is simply the opposite to assertEqual() by checking for non-equality.<br>
• assertRaises(exception,callable,...)<br>
You expect the callable to raise exception when executed. After the callable you can<br>specify any number of positional and keyword arguments for the callable. If you expect<br>a group of exceptions from the execution, you can make exception a tuple of possible<br>exceptions.<br>
• assert_(expr[,msg])<br>
Checks whether the specified expression evaluates to true. If not, the test fails and<br>msg (or None) is reported.<br>
• failUnlessEqual()<br>
This testing method is equivalent to assertEqual().<br>
• failUnless(expr[,msg])<br>
This method is equivalent to assert_(expr[,msg]).<br>
• failIf()<br>
This is the opposite to failUnless().<br>
• fail([msg])<br>
Fails the running test without any evaluation. This is commonly used when testing various<br>possible execution paths at once and you would like to signify a failure if an improper path<br>was taken.<br>
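The methods listed above can be exercised in one small test case. This is a sketch for a current Python 3 interpreter, not code from the original chapter: parse_positive and AssertionDemo are hypothetical names, and assertTrue() is used because recent Python releases no longer provide the older assert_(), failUnless() and failIf() spellings.<br>

```python
import unittest


def parse_positive(text):
    """Hypothetical helper: convert text to a positive int or raise."""
    number = int(text)  # raises ValueError for non-numeric text
    if number <= 0:
        raise ValueError('not positive: %r' % (text,))
    return number


class AssertionDemo(unittest.TestCase):

    def test_equality(self):
        self.assertEqual(parse_positive('3'), 3)
        self.assertNotEqual(parse_positive('3'), 4)

    def test_raises(self):
        # Extra arguments after the callable are passed on to it.
        self.assertRaises(ValueError, parse_positive, '-1')
        # A tuple accepts any of the listed exception types.
        self.assertRaises((TypeError, ValueError), parse_positive, None)

    def test_truth(self):
        # assertTrue() is the current spelling of assert_()/failUnless().
        self.assertTrue(parse_positive('7') > 0)

    def test_fail_on_wrong_path(self):
        try:
            parse_positive('abc')
        except ValueError:
            pass  # expected path
        else:
            self.fail('parse_positive accepted non-numeric text')


# Collect the four test methods and run them into a plain result object.
suite = unittest.TestLoader().loadTestsFromTestCase(AssertionDemo)
result = unittest.TestResult()
suite.run(result)
```

After the run, result.testsRun gives the number of executed tests and result.wasSuccessful() tells whether all of them passed.<br>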
Line 6–10: This method tests the title attribute of the Sample class. The first test should<br>be of course that the attribute exists and has the expected initial value (line 8). Then the title<br>attribute is changed and we check whether the value was really stored. This might seem like<br>overkill, but later you might change the title in a way that it uses properties instead. Then it<br>becomes very important to check whether this test still passes.<br>
Line 12–16: First we simply check that getDescription() returns the correct default value.<br>Since we do not want to use other API calls like setDescription() we set a new value of the<br>description via the implementation-internal _description attribute (line 15). This is okay! Unit<br>tests can make use of implementation-specific attributes and methods. Finally we just check that<br>the correct value is returned.<br>
<hr>
<A name=5></a>44.3. RUNNING THE TESTS<br>
5<br>
Line 18–25: On line 21–24 it is checked that both regular and unicode strings are set correctly.<br>In the last line of the test we make sure that no other type of objects can be set as a description<br>and that an error is raised.<br>
Line 28–31: This method returns a test suite that includes all test cases created in this module. It is<br>used by the Zope 3 test runner when it picks up all available tests. You would basically add the<br>line unittest.makeSuite(TestCaseClass) for each additional test case.<br>
Line 33–34: In order to make the test module runnable by itself, you can execute unittest.main()<br>when the module is run.<br>
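The suite-building pattern described above can also be driven by hand. The sketch below targets a current Python 3 interpreter and is an illustration rather than the chapter's own code: SuiteDemo is a hypothetical stand-in for SampleTest, and TestLoader.loadTestsFromTestCase() is used in place of unittest.makeSuite(), which newer Python releases deprecate.<br>

```python
import io
import unittest


class SuiteDemo(unittest.TestCase):
    """Hypothetical test case standing in for SampleTest."""

    def test_one(self):
        self.assertEqual('a'.upper(), 'A')

    def test_two(self):
        self.assertEqual(len('abc'), 3)


def test_suite():
    # One loadTestsFromTestCase() call per test case class in the module.
    loader = unittest.TestLoader()
    return unittest.TestSuite((
        loader.loadTestsFromTestCase(SuiteDemo),
    ))


# Run the suite with the text runner, capturing the report in memory.
stream = io.StringIO()
result = unittest.TextTestRunner(stream=stream).run(test_suite())
report = stream.getvalue()
```

The captured report begins with one dot per passing test, followed by the separator line and the summary of how many tests ran.<br>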
44.3<br>
Running the Tests<br>
You can run the test by simply calling python test_sample.py from the directory you saved the<br>file in. Here is the result you should see:<br>
...<br>--------------------------------------------------------------------<br>Ran 3 tests in 0.001s<br>
The three dots represent the three tests that were run. If a test had failed, it would have been<br>
reported pointing out the failing test and providing a small traceback.<br>
When using the default Zope 3 test runner, tests will be picked up as long as they follow some<br>
conventions.<br>
• The tests must either be in a package or be a module called tests.<br>
• If tests is a package, then all test modules inside must also have a name starting with test,<br>
as is the case with our test_sample.py.<br>
• The test module must be somewhere in the Zope 3 source tree, since the test runner looks<br>
only for files there.<br>
In our case, you could simply create a tests package in ZOPE3/src (do not forget the<br>
__init__.py file). Then place the test_sample.py file into this directory.<br>
You can use the test runner to run only the sample tests as follows from the Zope 3 root<br>
directory:<br>
python test.py -vp tests.test_sample<br>
The -v option stands for verbose mode, so that detailed information about a test failure is<br>
provided. The -p option enables a progress bar that tells you how many tests out of all have been<br>completed. There are many more options that can be specified. You can get a full list of them with<br>the option -h: python test.py -h.<br>
The output of the call above is as follows:<br>
nfiguration file found.<br>nning UNIT tests at level 1<br>nning UNIT tests from /opt/zope/Zope3<br>
3/3 (100.0%): test_title (tests.test_sample.SampleTest)<br>
--------------------------------------------------------------------<br>Ran 3 tests in 0.002s<br>
<hr>
<A name=6></a>6<br>
CHAPTER 44. WRITING BASIC UNIT TESTS<br>
Running FUNCTIONAL tests at level 1<br>Running FUNCTIONAL tests from /opt/zope/Zope3<br>
--------------------------------------------------------------------<br>Ran 0 tests in 0.000s<br>
Line 1: The test runner uses a configuration file for some setup. This allows developers to use<br>the test runner for other projects as well. This message simply tells us that the configuration file<br>was found.<br>
Line 2–8: The unit tests are run. On line 4 you can see the progress bar.<br>
Line 9–15: The functional tests are run, since the default test runner runs both types of tests.<br>Since we do not have any functional tests in the specified module, there are no tests to run. To<br>run only the unit tests, use option -u; to run only the functional tests, use -f. See “Writing<br>Functional Tests” for more details on functional tests.<br>
<hr>
<A name=7></a>44.3. RUNNING THE TESTS<br>
7<br>
Exercises<br>
1. It is not very common to do the setup – in our case sample=Sample() – in every test<br>
method. Instead there exists a method called setUp() and its counterpart tearDown that<br>are run before and after each test, respectively. Change the test code above, so that it uses<br>the setUp() method. In later chapters and the rest of the book we will frequently use this<br>method of setting up tests.<br>
2. Currently the test_setDescription() test only verifies that None is not allowed as input<br>
value.<br>
(a) Improve the test, so that all other builtin types are tested as well.<br>
(b) Also, make sure that any objects inheriting from str or unicode pass as valid values.<br>
<hr>
\ No newline at end of file
<A name=1></a>Chapter 44<br>
Writing Basic Unit Tests<br>
Difficulty<br>
Newcomer<br>
Skills<br>
• All you need to know is some Python.<br>
Problem/Task<br>
As you know by now, Zope 3 gains its incredible stability from testing any code in great detail. The<br>currently most common method is to write unit tests. This chapter introduces unit tests – which<br>are Zope 3 independent – and introduces some of the subtleties.<br>
Solution<br>
44.1<br>
Implementing the Sample Class<br>
Before we can write tests, we have to write some code that we can test. Here, we will implement<br>a simple class called Sample with a public attribute title and a description that is accessed<br>via getDescription() and mutated using setDescription(). Further, the description must be<br>either a regular or unicode string.<br>
Since this code will not depend on Zope, open a file named test_sample.py anywhere and add<br>
the following class:<br>
1 class Sample(object):<br>
2<br>
&quot;&quot;&quot;A trivial Sample object.&quot;&quot;&quot;<br>
3<br>
4<br>
title = None<br>
5<br>
6<br>
def __init__(self):<br>
7<br>
&quot;&quot;&quot;Initialize object.&quot;&quot;&quot;<br>
8<br>
self._description = ’’<br>
9<br>
1<br>
<hr>
<A name=2></a>2<br>
CHAPTER 44. WRITING BASIC UNIT TESTS<br>
10<br>
def setDescription(self, value):<br>
11<br>
&quot;&quot;&quot;Change the value of the description.&quot;&quot;&quot;<br>
12<br>
assert isinstance(value, (str, unicode))<br>
13<br>
self._description = value<br>
14<br>
15<br>
def getDescription(self):<br>
16<br>
&quot;&quot;&quot;Return the value of the description.&quot;&quot;&quot;<br>
17<br>
return self._description<br>
Line 4: The title is just publicly declared and a value of None is given. Therefore this is just<br>a regular attribute.<br>
Line 8: The actual description string will be stored in _description.<br>
Line 12: Make sure that the description is only a regular or unicode string, like it was stated in<br>the requirements.<br>
If you wish you can now manually test the class with the interactive Python shell. Just start<br>
Python by entering python in your shell prompt. Note that you should be in the directory in<br>which test_sample.py is located when starting Python (an alternative is of course to specify the<br>directory in your PYTHONPATH).<br>
1 &gt;&gt;&gt; from test_sample import Sample<br>2 &gt;&gt;&gt; sample = Sample()<br>
3 &gt;&gt;&gt; print sample.title<br>4 None<br>
5 &gt;&gt;&gt; sample.title = ’Title’<br>
6 &gt;&gt;&gt; print sample.title<br>7 Title<br>
8 &gt;&gt;&gt; print sample.getDescription()<br>9<br>
10 &gt;&gt;&gt; sample.setDescription(’Hello World’)<br>
11 &gt;&gt;&gt; print sample.getDescription()<br>12 Hello World<br>
13 &gt;&gt;&gt; sample.setDescription(None)<br>
14 Traceback (most recent call last):<br>
15<br>
File &quot;&lt;stdin&gt;&quot;, line 1, in ?<br>
16<br>
File &quot;test_sample.py&quot;, line 31, in setDescription<br>
17<br>
assert isinstance(value, (str, unicode))<br>
18 AssertionError<br>
As you can see in the last test, non-string object types are not allowed as descriptions and an<br>
AssertionError is raised.<br>
44.2<br>
Writing the Unit Tests<br>
The goal of writing the unit tests is to convert this informal, manual, and interactive testing session<br>into a formal test class. Python already provides a module called unittest for this purpose, which<br>is a port of the Java-based unit testing product, JUnit, by Kent Beck and Erich Gamma. There are<br>three levels to the testing framework (this list deviates a bit from the original definitions as found<br>in the Python library documentation 1).<br>
1 http://www.python.org/doc/current/lib/module-unittest.html<br>
<hr>
<A name=3></a>44.2. WRITING THE UNIT TESTS<br>
3<br>
The smallest unit is obviously the “test”, which is a single method in a TestCase class that<br>
tests the behavior of a small piece of code or a particular aspect of an implementation. The “test<br>case” is then a collection of tests that share the same setup/inputs. On top of all of this sits the “test<br>suite” which is a collection of test cases and/or other test suites. Test suites combine tests that<br>should be executed together. With the correct setup (as shown in the example below), you can<br>then execute test suites. For large projects like Zope 3, it is useful to know that there is also the<br>concept of a test runner, which manages the test run of all or a set of tests. The runner provides<br>useful feedback to the application, so that various user interfaces can be developed on top of it.<br>
But enough about the theory. In the following example, which you can simply put into the same<br>
file as your code above, you will see a test in common Zope 3 style.<br>
1 import unittest<br>2<br>
3 class SampleTest(unittest.TestCase):<br>4<br>
&quot;&quot;&quot;Test the Sample class&quot;&quot;&quot;<br>
5<br>
6<br>
def test_title(self):<br>
7<br>
sample = Sample()<br>
8<br>
self.assertEqual(sample.title, None)<br>
9<br>
sample.title = ’Sample Title’<br>
10<br>
self.assertEqual(sample.title, ’Sample Title’)<br>
11<br>
12<br>
def test_getDescription(self):<br>
13<br>
sample = Sample()<br>
14<br>
self.assertEqual(sample.getDescription(), ’’)<br>
15<br>
sample._description = &quot;Description&quot;<br>
16<br>
self.assertEqual(sample.getDescription(), ’Description’)<br>
17<br>
18<br>
def test_setDescription(self):<br>
19<br>
sample = Sample()<br>
20<br>
self.assertEqual(sample._description, ’’)<br>
21<br>
sample.setDescription(’Description’)<br>
22<br>
self.assertEqual(sample._description, ’Description’)<br>
23<br>
sample.setDescription(u’Description2’)<br>
24<br>
self.assertEqual(sample._description, u’Description2’)<br>
25<br>
self.assertRaises(AssertionError, sample.setDescription, None)<br>
26<br>
27<br>
28 def test_suite():<br>29<br>
return unittest.TestSuite((<br>
30<br>
unittest.makeSuite(SampleTest),<br>
31<br>
))<br>
32<br>
33 if __name__ == ’__main__’:<br>34<br>
unittest.main(defaultTest=’test_suite’)<br>
Line 3–4: We usually develop test classes which must inherit from TestCase. While often not<br>done, it is a good idea to give the class a meaningful docstring that describes the purpose of the<br>tests it includes.<br>
Line 6, 12 &amp; 18: When a test case is run, a method called runTest() is executed. While it<br>is possible to override this method to run tests differently, the default option will look for any<br>method whose name starts with test and execute it as a single test. This way we can create<br>a “test method” for each aspect, method, function or property of the code to be tested. This<br>default is very sensible and is used everywhere in Zope 3.<br>
<hr>
<A name=4></a>4<br>
CHAPTER 44. WRITING BASIC UNIT TESTS<br>
Note that there is no docstring for test methods. This is intentional. If a docstring is specified,<br>it is used instead of the method name to identify the test. When specifying a docstring, we have<br>noticed that it is very difficult to identify the test later; therefore the method name is a much<br>better choice.<br>
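The effect of a docstring on how a test is identified can be seen with a small sketch. The class and method names below are made up for illustration; only shortDescription() and id() come from the standard unittest module.

```python
import unittest


class DocstringDemo(unittest.TestCase):
    """Hypothetical test case used only to show how tests are labelled."""

    def test_without_docstring(self):
        pass

    def test_with_docstring(self):
        """Check something vague."""
        pass


# A TestCase instance is bound to one test method by name.
plain = DocstringDemo('test_without_docstring')
documented = DocstringDemo('test_with_docstring')

# shortDescription() gives the first docstring line, or None if there is none.
print(plain.shortDescription())
print(documented.shortDescription())

# id() always contains the class and method name, which is easier to look up.
print(documented.id())
```

A docstring such as the one above says little about where the test lives, which is why the method name is usually the more useful label.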
Line 8, 10, 14, . . . : The TestCase class implements a handful of methods that aid you with the<br>testing. Here are some of the most frequently used ones. For a complete list see the standard<br>Python documentation referenced above.<br>
• assertEqual(first,second[,msg])<br>
Checks whether the first and second value are equal. If they are not, the test fails and<br>msg, if given, is used as the failure message.<br>
• assertNotEqual(first,second[,msg])<br>
This is simply the opposite to assertEqual() by checking for non-equality.<br>
• assertRaises(exception,callable,...)<br>
You expect the callable to raise exception when executed. After the callable you can<br>specify any number of positional and keyword arguments for the callable. If you expect<br>a group of exceptions from the execution, you can make exception a tuple of possible<br>exceptions.<br>
• assert_(expr[,msg])<br>
Checks whether the specified expression evaluates to true. If not, the test fails and<br>msg, if given, is used as the failure message.<br>
• failUnlessEqual()<br>
This testing method is equivalent to assertEqual().<br>
• failUnless(expr[,msg])<br>
This method is equivalent to assert_(expr[,msg]).<br>
• failIf()<br>
This is the opposite to failUnless().<br>
• fail([msg])<br>
Fails the running test without any evaluation. This is commonly used when testing various<br>possible execution paths at once and you would like to signify a failure if an improper path<br>was taken.<br>
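The most common of these methods can be tried out in a short, self-contained sketch. The helper parse_positive() is hypothetical and exists only to give the assertions something to check. The sketch uses the assert* spellings, because the fail* aliases and assert_() are not available in newer Python releases.

```python
import unittest


def parse_positive(text):
    """Hypothetical helper: convert text to a positive integer."""
    number = int(text)  # raises ValueError for non-numeric text
    if number <= 0:
        raise ValueError('not positive: %r' % (text,))
    return number


class AssertionDemo(unittest.TestCase):

    def test_equal(self):
        self.assertEqual(parse_positive('3'), 3)
        self.assertNotEqual(parse_positive('3'), 4)

    def test_raises(self):
        # Arguments after the callable are passed on to it.
        self.assertRaises(ValueError, parse_positive, '-1')
        # A tuple matches any one of the listed exception types.
        self.assertRaises((ValueError, TypeError), parse_positive, None)

    def test_fail(self):
        try:
            parse_positive('abc')
        except ValueError:
            pass  # the expected path
        else:
            # Reached only if no exception was raised at all.
            self.fail('parse_positive() accepted non-numeric text')


suite = unittest.TestLoader().loadTestsFromTestCase(AssertionDemo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```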
Line 6–10: This method tests the title attribute of the Sample class. The first test should<br>be of course that the attribute exists and has the expected initial value (line 8). Then the title<br>attribute is changed and we check whether the value was really stored. This might seem like<br>overkill, but later you might change the title in a way that it uses properties instead. Then it<br>becomes very important to check whether this test still passes.<br>
Line 12–16: First we simply check that getDescription() returns the correct default value.<br>Since we do not want to use other API calls like setDescription() we set a new value of the<br>description via the implementation-internal _description attribute (line 15). This is okay! Unit<br>tests can make use of implementation-specific attributes and methods. Finally we just check that<br>the correct value is returned.<br>
<hr>
<A name=5></a>44.3. RUNNING THE TESTS<br>
5<br>
Line 18–25: On line 21–24 it is checked that both regular and unicode strings are set correctly.<br>In the last line of the test we make sure that no other type of objects can be set as a description<br>and that an error is raised.<br>
Line 28–31: This function returns a test suite that includes all test cases created in this module. It is<br>used by the Zope 3 test runner when it picks up all available tests. You would basically add the<br>line unittest.makeSuite(TestCaseClass) for each additional test case.<br>
Line 33–34: In order to make the test module runnable by itself, you can execute unittest.main()<br>when the module is run.<br>
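How a test_suite() function grows with additional test cases can be sketched as follows. The two test case classes are invented for the example. The sketch also assumes TestLoader.loadTestsFromTestCase() as a stand-in for unittest.makeSuite(), since makeSuite() is not available in the newest Python releases.

```python
import unittest


class FirstTest(unittest.TestCase):
    """Hypothetical test case number one."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


class SecondTest(unittest.TestCase):
    """Hypothetical test case number two."""

    def test_upper(self):
        self.assertEqual('zope'.upper(), 'ZOPE')


def test_suite():
    # One entry per test case class; add a line for each new class.
    loader = unittest.TestLoader()
    return unittest.TestSuite((
        loader.loadTestsFromTestCase(FirstTest),
        loader.loadTestsFromTestCase(SecondTest),
    ))


if __name__ == '__main__':
    # Run the combined suite with the plain text runner.
    unittest.TextTestRunner(verbosity=2).run(test_suite())
```

Because the suite is built inside a function, an external test runner can import the module and call test_suite() without running anything at import time.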
44.3<br>
Running the Tests<br>
You can run the test by simply calling python test_sample.py from the directory you saved the<br>file in. Here is the result you should see:<br>
...<br>--------------------------------------------------------------------<br>Ran 3 tests in 0.001s<br>
The three dots represent the three tests that were run. If a test had failed, it would have been<br>
reported pointing out the failing test and providing a small traceback.<br>
When using the default Zope 3 test runner, tests will be picked up as long as they follow some<br>
conventions.<br>
• The tests must either be in a package or be a module called tests.<br>
• If tests is a package, then all test modules inside must also have a name starting with test,<br>
as is the case with our test_sample.py.<br>
• The test module must be somewhere in the Zope 3 source tree, since the test runner looks<br>
only for files there.<br>
In our case, you could simply create a tests package in ZOPE3/src (do not forget the<br>
__init__.py file). Then place the test_sample.py file into this directory.<br>
You can then use the test runner to run only the sample tests as follows from the Zope 3 root<br>
directory:<br>
python test.py -vp tests.test_sample<br>
The -v option stands for verbose mode, so that detailed information about a test failure is<br>
provided. The -p option enables a progress bar that tells you how many tests out of all have been<br>completed. There are many more options that can be specified. You can get a full list of them with<br>the option -h: python test.py -h.<br>
The output of the call above is as follows:<br>
Configuration file found.<br>Running UNIT tests at level 1<br>Running UNIT tests from /opt/zope/Zope3<br>
3/3 (100.0%): test_title (tests.test_sample.SampleTest)<br>
--------------------------------------------------------------------<br>Ran 3 tests in 0.002s<br>
<hr>
<A name=6></a>6<br>
CHAPTER 44. WRITING BASIC UNIT TESTS<br>
Running FUNCTIONAL tests at level 1<br>Running FUNCTIONAL tests from /opt/zope/Zope3<br>
--------------------------------------------------------------------<br>Ran 0 tests in 0.000s<br>
Line 1: The test runner uses a configuration file for some setup. This allows developers to use<br>the test runner for other projects as well. This message simply tells us that the configuration file<br>was found.<br>
Line 2–8: The unit tests are run. On line 4 you can see the progress bar.<br>
Line 9–15: The functional tests are run, since the default test runner runs both types of tests.<br>Since we do not have any functional tests in the specified module, there are no tests to run. To<br>run only the unit tests, use option -u; to run only the functional tests, use -f. See “Writing<br>Functional Tests” for more details on functional tests.<br>
<hr>
<A name=7></a>44.3. RUNNING THE TESTS<br>
7<br>
Exercises<br>
1. It is not very common to do the setup – in our case sample=Sample() – in every test<br>
method. Instead there exists a method called setUp() and its counterpart tearDown that<br>are run before and after each test, respectively. Change the test code above, so that it uses<br>the setUp() method. In later chapters and the rest of the book we will frequently use this<br>method of setting up tests.<br>
2. Currently the test_setDescription() test only verifies that None is not allowed as input<br>
value.<br>
(a) Improve the test, so that all other builtin types are tested as well.<br>
(b) Also, make sure that any objects inheriting from str or unicode pass as valid values.<br>
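For the first exercise, one possible shape of a setUp()-based test case is sketched below. It is an illustration under assumptions rather than the book's solution: the Sample class is a trimmed copy that only accepts str so that the sketch is self-contained, and the suite is built with TestLoader instead of makeSuite().

```python
import unittest


class Sample(object):
    """Trimmed copy of the chapter's Sample class (assumption: str only)."""

    title = None

    def __init__(self):
        self._description = ''

    def setDescription(self, value):
        assert isinstance(value, str)
        self._description = value

    def getDescription(self):
        return self._description


class SampleTest(unittest.TestCase):
    """Test the Sample class using a shared fixture."""

    def setUp(self):
        # Runs before every test method, so each test gets a fresh object.
        self.sample = Sample()

    def tearDown(self):
        # Runs after every test method, even if the test failed.
        del self.sample

    def test_title(self):
        self.assertEqual(self.sample.title, None)
        self.sample.title = 'Sample Title'
        self.assertEqual(self.sample.title, 'Sample Title')

    def test_setDescription(self):
        self.sample.setDescription('Description')
        self.assertEqual(self.sample._description, 'Description')
        self.assertRaises(AssertionError, self.sample.setDescription, None)


suite = unittest.TestLoader().loadTestsFromTestCase(SampleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```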
<hr>
\ No newline at end of file
<?xml version="1.0"?>
Logilab.org newsen<tr><td><a href="">xmltools 1.3.7</a></td></tr><tr><td><a href="">Python-logic</a></td></tr><tr><td><a href="">PyReverse 0.2.3</a></td></tr><tr><td><a href="">xmltools 1.3.6</a></td></tr><tr><td><a href="">hmm-0.2</a></td></tr><tr><td><a href="">Version 1.2a1 is out</a></td></tr><tr><td><a href="">XMLdiff v0.5.3 (bug fixes)</a></td></tr><tr><td><a href="">hmm-0.1</a></td></tr><tr><td><a href="">PyReverse 0.1 (new product)</a></td></tr><tr><td><a href="">PyPaSax 0.3 (bug fixes)</a></td></tr><tr><td><a href="">XMLdiff v0.5.2 (bug fixes)</a></td></tr><tr><td><a href="">Version 1.1 is out</a></td></tr><tr><td><a href="">Version 1.1b3 is out</a></td></tr><tr><td><a href="">xmltools-1.3.5</a></td></tr><tr><td><a href="">xmltools-1.3.4</a></td></tr><tr><td><a href="">Version 1.1b1 is out</a></td></tr><tr><td><a href="">XMLdiff v0.5 (algorithm change, bug fixes)</a></td></tr><tr><td><a href="">XMLtools v1.3.1 (bugfixes)</a></td></tr><tr><td><a href="">XMLdiff v0.2 (performance improvement)</a></td></tr><tr><td><a href="">XMLdiff v0.1.1 (beta release)</a></td></tr><tr><td><a href="">XPathVis v1.0beta (beta release)</a></td></tr><tr><td><a href="">XMLtools v1.3 (new features)</a></td></tr><tr><td><a href="http://www-106.ibm.com/developerworks/library/l-ai/">Narval on developerWorks</a></td></tr><tr><td><a href="">Version 1.0.1 is out</a></td></tr><tr><td><a href="http://ai.about.com/compute/ai/library/weekly/aa060801a.htm">Narval reviewed on AI.About.com</a></td></tr><tr><td><a href="http://www.ptolemee.com/botshow/text/text_fr/edito/edito_set.html">Narval at BotShow 2001</a></td></tr><tr><td><a href="">Version 1.0 is out</a></td></tr><tr><td><a href="">Network-boot-HOWTO v0.2.1</a></td></tr><tr><td><a href="">GuessLang v0.1.0 (beta release)</a></td></tr><tr><td><a href="">Network-boot-HOWTO v0.1.1</a></td></tr><tr><td><a href="">PyPaSax v0.1</a></td></tr><tr><td><a href="">RC2 is out</a></td></tr><tr><td><a href="">VCalSax v0.1 
(beta)</a></td></tr><tr><td><a href="http://www.logilab.com/press/linux-expo2001/">Talk at LinuxExpo in English</a></td></tr><tr><td><a href="">RC1 is out</a></td></tr><tr><td><a href="">XMLtools v1.2 (stable release)</a></td></tr><tr><td><a href="http://www.logilab.org/narval/app.html">Application section on web site</a></td></tr><tr><td><a href="">WMgMon v0.4.0</a></td></tr><tr><td><a href="">XmlTools v1.1</a></td></tr><tr><td><a href="">Beta5 is out</a></td></tr><tr><td><a href="">XmlTools v1.0</a></td></tr><tr><td><a href="">PyGantt v0.6.0</a></td></tr><tr><td><a href="">Beta4 is out</a></td></tr><tr><td><a href="http://www.linuxgazette.com/issue59/chauvat.html">Article on Narval in Linux Gazette</a></td></tr><tr><td><a href="">Beta3 is out</a></td></tr><tr><td><a href="http://www.linuxexpoparis.com/EN/conferences">Logilab invited at Linux Expo</a></td></tr><tr><td><a href="">Beta2 is out</a></td></tr><tr><td><a href="">Beta1 is out</a></td></tr><tr><td><a href="http://www.logilab.org">Beta0 is out</a></td></tr>
<p>This is a test of the *reST* transform<br /> o one<br /> o two<br /> o three</p>
\ No newline at end of file
<dl class="docutils">
<dt>This is a test of the <em>reST</em> transform</dt>
<dd>o one
o two
o three</dd>
</dl>
This is a test of the *reST* transform
o one
o two
o three
<h2 class="title">Heading 1</h2>
<p>Some text.</p>
<div class="section" id="heading-2">
<h3><a name="heading-2">Heading 2</a></h3>
<p>Some text, bla ble bli blo blu. Yes, i know this is <a class="reference" href="http://www.example.com">Stupid</a>.</p>
</div>
<h2 class="title">Title</h2>
<h3 class="subtitle">Subtitle</h3>
<p>This is a test document to make sure subtitle gets the right heading.</p>
<div class="section" id="now-the-real-heading">
<h3><a name="now-the-real-heading">Now the real heading</a></h3>
<p>The brown fox jumped over the lazy dog.</p>
<div class="section" id="with-a-subheading">
<h4><a name="with-a-subheading">With a subheading</a></h4>
<p>Some text, bla ble bli blo blu. Yes, i know this is <a class="reference" href="http://www.example.com">Stupid</a>.</p>
</div>
</div>
Copying Docutils
Copying Docutils
Author:
David Goodger
Contact:
goodger&#64;users.sourceforge.net
Date:
2002-10-03
Web site:http://docutils.sourceforge.net/
Most of the files included in this project are in the public domain,
and therefore have no license requirement and no restrictions on
copying or usage. The exceptions are:
docutils/optik.py, copyright Gregory P. Ward, released under a
BSD-style license (which can be found in the module's source code).
docutils/roman.py, copyright by Mark Pilgrim, released under the
Python 2.1.1 license.
test/difflib.py, copyright by the Python Software Foundation,
released under the Python 2.2 license. This file is included for
compatibility with Python versions less than 2.2; if you have Python
2.2 or higher, difflib.py is not needed and may be removed. (It's
only used to report test failures anyhow; it isn't installed
anywhere. The included file is a pre-generator version of the
difflib.py module included in Python 2.2.)
(Disclaimer: I am not a lawyer.) Both the BSD license and the Python
license are OSI-approved and GPL-compatible. Although complicated
by multiple owners and lots of legalese, the Python license basically
lets you copy, use, modify, and redistribute files as long as you keep
the copyright attribution intact, note any changes you make, and don't
use the owner's name in vain. The BSD license is similar.
Generated on: 2003-04-19 15:32 UTC.
Generated by Docutils from reStructuredText source.
Copying Docutils
Author: David Goodger
Contact: goodger@users.sourceforge.net
Date: 2002-10-03
Web site: http://docutils.sourceforge.net/
Most of the files included in this project are in the public domain,
and therefore have no license requirement and no restrictions on
copying or usage. The exceptions are:
* docutils/optik.py, copyright Gregory P. Ward, released under a
BSD-style license (which can be found in the module's source
code).
* docutils/roman.py, copyright by Mark Pilgrim, released under the
Python 2.1.1 license.
* test/difflib.py, copyright by the Python Software Foundation,
released under the Python 2.2 license. This file is included for
compatibility with Python versions less than 2.2; if you have
Python 2.2 or higher, difflib.py is not needed and may be removed.
(It's only used to report test failures anyhow; it isn't installed
anywhere. The included file is a pre-generator version of the
difflib.py module included in Python 2.2.)
(Disclaimer: I am not a lawyer.) Both the BSD license and the Python
license are OSI-approved and GPL-compatible. Although complicated by
multiple owners and lots of legalese, the Python license basically
lets you copy, use, modify, and redistribute files as long as you keep
the copyright attribution intact, note any changes you make, and don't
use the owner's name in vain. The BSD license is similar.
_________________________________________________________________
Generated on: 2003-04-19 15:32 UTC. Generated by Docutils from
reStructuredText source.
<pre class="python">
<span style="color: #004080;">&quot;&quot;&quot; nice docstring &quot;&quot;&quot;</span>
<span style="color: #C00000;">class</span> <span style="color: #000000;">A</span> <span style="color: #0000C0;">:</span> <span style="color: #C00000;">pass</span>
<span style="color: #008000;"># comment
</span>
<span style="color: #C00000;">def</span> <span style="color: #000000;">inc</span><span style="color: #0000C0;">(</span><span style="color: #000000;">i</span><span style="color: #0000C0;">)</span><span style="color: #0000C0;">:</span>
<span style="color: #C00000;">return</span> <span style="color: #000000;">i</span><span style="color: #0000C0;">+</span><span style="color: #0080C0;">1</span>
<span style="color: #C00000;">def</span> <span style="color: #000000;">greater</span><span style="color: #0000C0;">(</span><span style="color: #000000;">a</span><span style="color: #0000C0;">,</span> <span style="color: #000000;">b</span><span style="color: #0000C0;">)</span><span style="color: #0000C0;">:</span>
<span style="color: #004080;">&quot;&quot;&quot;foo &lt;html /&gt;&quot;&quot;&quot;</span>
<span style="color: #C00000;">return</span> <span style="color: #000000;">a</span> <span style="color: #0000C0;">&gt;</span> <span style="color: #000000;">b</span>
</pre>
<h1>Test page</h1>
<table>
<tr>
<th>Test1</th>
<td>test2</td>
</tr>
</table>
<p>This is a text used as a blind text.</p>
<ul>
<li>A sample list item1</li>
<li>A sample list item2</li>
</ul>
<p>This is again a blind text with a<br />line break.</p>
<div>
Can we <q>quote</q> or write something we <del>didn't</del> mean to write? Or how is <ins>this</ins> instead?
</div>
<hr />
<div>
<a href="http://www.plone.org"><img src="http://www.plone.org/logo.jpg" /></a> is just great.
</div>
\ No newline at end of file
<br />
<p><div name="Default" align="left" style=" padding: 0.00mm 0.00mm 0.00mm 0.00mm; ">
<p style="text-indent: 0.00mm; text-align: left; line-height: 4.166667mm; color: Black; background-color: White; ">
how odd: blank named file in directory
</p></div>
#
# Runs all tests in the current directory
#
# Execute like:
# python runalltests.py
#
# Alternatively use the testrunner:
# python /path/to/Zope/utilities/testrunner.py -qa
#
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
import unittest
TestRunner = unittest.TextTestRunner
suite = unittest.TestSuite()
tests = os.listdir(os.curdir)
tests = [n[:-3] for n in tests if n.startswith('test') and n.endswith('.py')]
for test in tests:
m = __import__(test)
if hasattr(m, 'test_suite'):
suite.addTest(m.test_suite())
if __name__ == '__main__':
TestRunner().run(suite)
#!/bin/bash
#
# example test runner shell script
#
# full path to the python interpreter
export PYTHON="/usr/local/bin/python2.3"
# path to ZOPE_HOME/lib/python
export SOFTWARE_HOME="/opt/zope/releases/Zope-2_7-branch/lib/python"
# path to your instance. Don't set it if you don't have an instance
export INSTANCE_HOME="/opt/zope/instances/plone21/"
${PYTHON} runalltests.py
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase
from Products.PortalTransforms.utils import TransformException
from Products.PortalTransforms.interfaces import *
from Products.PortalTransforms.chain import chain
import urllib
import time
import re
class BaseTransform:
def name(self):
return getattr(self, '__name__', self.__class__.__name__)
class HtmlToText(BaseTransform):
__implements__ = itransform
inputs = ('text/html',)
output = 'text/plain'
def __call__(self, orig, **kwargs):
orig = re.sub('<[^>]*>(?i)(?m)', '', orig)
return urllib.unquote(re.sub('\n+', '\n', orig)).strip()
def convert(self, orig, data, **kwargs):
orig = self.__call__(orig)
data.setData(orig)
return data
class HtmlToTextWithEncoding(HtmlToText):
output_encoding = 'ascii'
class FooToBar(BaseTransform):
__implements__ = itransform
inputs = ('text/*',)
output = 'text/plain'
def __call__(self, orig, **kwargs):
orig = re.sub('foo', 'bar', orig)
return urllib.unquote(re.sub('\n+', '\n', orig)).strip()
def convert(self, orig, data, **kwargs):
orig = self.__call__(orig)
data.setData(orig)
return data
class TransformNoIO(BaseTransform):
__implements__ = itransform
class BadTransformMissingImplements(BaseTransform):
__implements__ = None
inputs = ('text/*',)
output = 'text/plain'
class BadTransformBadMIMEType1(BaseTransform):
__implements__ = itransform
inputs = ('truc/muche',)
output = 'text/plain'
class BadTransformBadMIMEType2(BaseTransform):
__implements__ = itransform
inputs = ('text/plain',)
output = 'truc/muche'
class BadTransformNoInput(BaseTransform):
__implements__ = itransform
inputs = ()
output = 'text/plain'
class BadTransformWildcardOutput(BaseTransform):
__implements__ = itransform
inputs = ('text/plain',)
output = 'text/*'
class TestEngine(ATSiteTestCase):
def afterSetUp(self):
ATSiteTestCase.afterSetUp(self)
self.engine = self.portal.portal_transforms
self.data = '<b>foo</b>'
def register(self):
#A default set of transforms to prove the interfaces work
self.engine.registerTransform(HtmlToText())
self.engine.registerTransform(FooToBar())
def testRegister(self):
self.register()
def testFailRegister(self):
register = self.engine.registerTransform
self.assertRaises(TransformException, register, TransformNoIO())
self.assertRaises(TransformException, register, BadTransformMissingImplements())
self.assertRaises(TransformException, register, BadTransformBadMIMEType1())
self.assertRaises(TransformException, register, BadTransformBadMIMEType2())
self.assertRaises(TransformException, register, BadTransformNoInput())
self.assertRaises(TransformException, register, BadTransformWildcardOutput())
def testCall(self):
self.register()
data = self.engine('HtmlToText', self.data)
self.failUnlessEqual(data, "foo")
data = self.engine('FooToBar', self.data)
self.failUnlessEqual(data, "<b>bar</b>")
def testConvert(self):
self.register()
data = self.engine.convert('HtmlToText', self.data)
self.failUnlessEqual(data.getData(), "foo")
self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
self.failUnlessEqual(data.getMetadata().get('encoding'), None)
self.failUnlessEqual(data.name(), "HtmlToText")
self.engine.registerTransform(HtmlToTextWithEncoding())
data = self.engine.convert('HtmlToTextWithEncoding', self.data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
self.failUnlessEqual(data.getMetadata()['encoding'], 'ascii')
self.failUnlessEqual(data.name(), "HtmlToTextWithEncoding")
def testConvertTo(self):
self.register()
data = self.engine.convertTo('text/plain', self.data, mimetype="text/html")
self.failUnlessEqual(data.getData(), "foo")
self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
self.failUnlessEqual(data.getMetadata().get('encoding'), None)
self.failUnlessEqual(data.name(), "text/plain")
self.engine.unregisterTransform('HtmlToText')
self.engine.unregisterTransform('FooToBar')
self.engine.registerTransform(HtmlToTextWithEncoding())
data = self.engine.convertTo('text/plain', self.data, mimetype="text/html")
self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
# XXX the new algorithm is choosing html_to_text instead of
# HtmlToTextWithEncoding. Now None is the right
#self.failUnlessEqual(data.getMetadata()['encoding'], 'ascii')
self.failUnlessEqual(data.getMetadata()['encoding'], None)
self.failUnlessEqual(data.name(), "text/plain")
def testChain(self):
self.register()
hb = chain('hbar')
hb.registerTransform(HtmlToText())
hb.registerTransform(FooToBar())
self.engine.registerTransform(hb)
cache = self.engine.convert('hbar', self.data)
self.failUnlessEqual(cache.getData(), "bar")
self.failUnlessEqual(cache.name(), "hbar")
def testSame(self):
data = "This is a test"
mt = "text/plain"
out = self.engine.convertTo('text/plain', data, mimetype=mt)
self.failUnlessEqual(out.getData(), data)
self.failUnlessEqual(out.getMetadata()['mimetype'], 'text/plain')
def testCache(self):
data = "This is a test"
other_data = 'some different data'
mt = "text/plain"
self.engine.max_sec_in_cache = 20
out = self.engine.convertTo(mt, data, mimetype=mt, object=self)
self.failUnlessEqual(out.getData(), data, out.getData())
out = self.engine.convertTo(mt, other_data, mimetype=mt, object=self)
self.failUnlessEqual(out.getData(), data, out.getData())
self.engine.max_sec_in_cache = -1
out = self.engine.convertTo(mt, data, mimetype=mt, object=self)
self.failUnlessEqual(out.getData(), data, out.getData())
out = self.engine.convertTo(mt, other_data, mimetype=mt, object=self)
self.failUnlessEqual(out.getData(), other_data, out.getData())
def test_suite():
from unittest import TestSuite, makeSuite
suite = TestSuite()
suite.addTest(makeSuite(TestEngine))
return suite
if __name__ == '__main__':
framework()
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase
from utils import input_file_path
FILE_PATH = input_file_path("demo1.pdf")
class TestGraph(ATSiteTestCase):
def afterSetUp(self):
ATSiteTestCase.afterSetUp(self)
self.engine = self.portal.portal_transforms
def testGraph(self):
### XXX Local file and expected output
data = open(FILE_PATH, 'rb').read()  # PDF is binary data
out = self.engine.convertTo('text/plain', data, filename=FILE_PATH)
assert(out.getData())
def test_suite():
from unittest import TestSuite, makeSuite
suite = TestSuite()
suite.addTest(makeSuite(TestGraph))
return suite
if __name__ == '__main__':
framework()
from __future__ import nested_scopes
import os, sys
if __name__ == '__main__':
execfile(os.path.join(sys.path[0], 'framework.py'))
from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase
from utils import input_file_path, output_file_path, normalize_html,\
load, matching_inputs
from Products.PortalTransforms.data import datastream
from Products.PortalTransforms.interfaces import idatastream
from Products.MimetypesRegistry.MimeTypesTool import MimeTypesTool
from Products.PortalTransforms.TransformEngine import TransformTool
from Products.PortalTransforms.libtransforms.utils import MissingBinary
from Products.PortalTransforms.transforms.image_to_gif import image_to_gif
from Products.PortalTransforms.transforms.image_to_png import image_to_png
from Products.PortalTransforms.transforms.image_to_jpeg import image_to_jpeg
from Products.PortalTransforms.transforms.image_to_bmp import image_to_bmp
from Products.PortalTransforms.transforms.image_to_tiff import image_to_tiff
from Products.PortalTransforms.transforms.image_to_ppm import image_to_ppm
from Products.PortalTransforms.transforms.image_to_pcx import image_to_pcx
from os.path import exists
import sys
# we have to set locale because lynx output is locale sensitive !
os.environ['LC_ALL'] = 'C'
class TransformTest(ATSiteTestCase):
def do_convert(self, filename=None):
if filename is None and exists(self.output + '.nofilename'):
output = self.output + '.nofilename'
else:
output = self.output
input = open(self.input)
orig = input.read()
input.close()
data = datastream(self.transform.name())
res_data = self.transform.convert(orig, data, filename=filename)
self.assert_(idatastream.isImplementedBy(res_data))
got = res_data.getData()
try:
output = open(output)
except IOError:
import sys
print >>sys.stderr, 'No output file found.'
print >>sys.stderr, 'File %s created, check it !' % self.output
output = open(output, 'w')
output.write(got)
output.close()
self.assert_(0)
expected = output.read()
if self.normalize is not None:
expected = self.normalize(expected)
got = self.normalize(got)
output.close()
self.assertEquals(got, expected,
'[%s]\n\n!=\n\n[%s]\n\nIN %s(%s)' % (
got, expected, self.transform.name(), self.input))
self.assertEquals(self.subobjects, len(res_data.getSubObjects()),
'%s\n\n!=\n\n%s\n\nIN %s(%s)' % (
self.subobjects, len(res_data.getSubObjects()), self.transform.name(), self.input))
def testSame(self):
self.do_convert(filename=self.input)
def testSameNoFilename(self):
self.do_convert()
def __repr__(self):
return self.transform.name()
class PILTransformsTest(ATSiteTestCase):
def afterSetUp(self):
ATSiteTestCase.afterSetUp(self)
self.pt = self.portal.portal_transforms
def test_image_to_bmp(self):
self.pt.registerTransform(image_to_bmp())
imgFile = open(input_file_path('logo.jpg'), 'rb')
data = imgFile.read()
self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/jpeg')
data = self.pt.convertTo(target_mimetype='image/x-ms-bmp',orig=data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/x-ms-bmp')
def test_image_to_gif(self):
self.pt.registerTransform(image_to_gif())
imgFile = open(input_file_path('logo.png'), 'rb')
data = imgFile.read()
self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/png')
data = self.pt.convertTo(target_mimetype='image/gif',orig=data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/gif')
def test_image_to_jpeg(self):
self.pt.registerTransform(image_to_jpeg())
imgFile = open(input_file_path('logo.gif'), 'rb')
data = imgFile.read()
self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/gif')
data = self.pt.convertTo(target_mimetype='image/jpeg',orig=data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/jpeg')
def test_image_to_png(self):
self.pt.registerTransform(image_to_png())
imgFile = open(input_file_path('logo.jpg'), 'rb')
data = imgFile.read()
self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/jpeg')
data = self.pt.convertTo(target_mimetype='image/png',orig=data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/png')
def test_image_to_pcx(self):
self.pt.registerTransform(image_to_pcx())
imgFile = open(input_file_path('logo.gif'), 'rb')
data = imgFile.read()
self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/gif')
data = self.pt.convertTo(target_mimetype='image/pcx',orig=data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/pcx')
def test_image_to_ppm(self):
self.pt.registerTransform(image_to_ppm())
imgFile = open(input_file_path('logo.png'), 'rb')
data = imgFile.read()
self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/png')
data = self.pt.convertTo(target_mimetype='image/x-portable-pixmap',orig=data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/x-portable-pixmap')
def test_image_to_tiff(self):
self.pt.registerTransform(image_to_tiff())
imgFile = open(input_file_path('logo.jpg'), 'rb')
data = imgFile.read()
self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/jpeg')
data = self.pt.convertTo(target_mimetype='image/tiff',orig=data)
self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/tiff')
TRANSFORMS_TESTINFO = (
('Products.PortalTransforms.transforms.pdf_to_html',
"demo1.pdf", "demo1.html", None, 0
),
('Products.PortalTransforms.transforms.word_to_html',
"test.doc", "test_word.html", normalize_html, 0
),
('Products.PortalTransforms.transforms.lynx_dump',
"test_lynx.html", "test_lynx.txt", None, 0
),
('Products.PortalTransforms.transforms.html_to_text',
"test_lynx.html", "test_html_to_text.txt", None, 0
),
('Products.PortalTransforms.transforms.identity',
"rest1.rst", "rest1.rst", None, 0
),
('Products.PortalTransforms.transforms.text_to_html',
"rest1.rst", "rest1.html", None, 0
),
('Products.PortalTransforms.transforms.safe_html',
"test_safehtml.html", "test_safe.html", None, 0
),
('Products.PortalTransforms.transforms.image_to_bmp',
"logo.jpg", "logo.bmp", None, 0
),
('Products.PortalTransforms.transforms.image_to_gif',
"logo.bmp", "logo.gif", None, 0
),
('Products.PortalTransforms.transforms.image_to_jpeg',
"logo.gif", "logo.jpg", None, 0
),
('Products.PortalTransforms.transforms.image_to_png',
"logo.bmp", "logo.png", None, 0
),
('Products.PortalTransforms.transforms.image_to_ppm',
"logo.gif", "logo.ppm", None, 0
),
('Products.PortalTransforms.transforms.image_to_tiff',
"logo.png", "logo.tiff", None, 0
),
('Products.PortalTransforms.transforms.image_to_pcx',
"logo.png", "logo.pcx", None, 0
),
)
def initialise(transform, normalize, pattern):
global TRANSFORMS_TESTINFO
for fname in matching_inputs(pattern):
outname = '%s.out' % fname.split('.')[0]
#print transform, fname, outname
TRANSFORMS_TESTINFO += ((transform, fname, outname, normalize, 0),)
# ReST test cases
initialise('Products.PortalTransforms.transforms.rest', normalize_html, "rest*.rst")
# Python test cases
initialise('Products.PortalTransforms.transforms.python', normalize_html, "*.py")
# FIXME missing tests for image_to_html, st
TR_NAMES = None
def make_tests(test_descr=TRANSFORMS_TESTINFO):
"""generate tests classes from test info
return the list of generated test classes
"""
tests = []
for _transform, tr_input, tr_output, _normalize, _subobjects in test_descr:
# load transform if necessary
if type(_transform) is type(''):
try:
_transform = load(_transform).register()
except MissingBinary:
# we are not interested in tests with missing binaries
continue
except:
import traceback
traceback.print_exc()
continue
if TR_NAMES is not None and _transform.name() not in TR_NAMES:
print 'skip test for', _transform.name()
continue
class TransformTestSubclass(TransformTest):
input = input_file_path(tr_input)
output = output_file_path(tr_output)
transform = _transform
normalize = lambda x, y: _normalize(y)
subobjects = _subobjects
tests.append(TransformTestSubclass)
tests.append(PILTransformsTest)
return tests
def test_suite():
from unittest import TestSuite, makeSuite
suite = TestSuite()
for test in make_tests():
suite.addTest(makeSuite(test))
return suite
if __name__ == '__main__':
framework()
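The loop in make_tests builds one TestCase subclass per entry in TRANSFORMS_TESTINFO. A minimal sketch of the same idea in modern Python follows. CASES, build_suite, run_quietly and the Test_ class names are hypothetical and not part of PortalTransforms. The default arguments on test_same are deliberate: a closure, like the lambda stored in `normalize` above, looks up loop variables when it is called, not when it is defined, so freezing them per class is worth considering.

```python
import io
import unittest

# Hypothetical test table: (name, function under test, input, expected output).
CASES = (
    ("upper", str.upper, "ab", "AB"),
    ("strip", str.strip, " ab ", "ab"),
)

def make_tests(cases=CASES):
    """Return one generated TestCase subclass per entry in `cases`."""
    classes = []
    for name, func, arg, expected in cases:
        # Default arguments freeze the current loop values for this class.
        def test_same(self, func=func, arg=arg, expected=expected):
            self.assertEqual(func(arg), expected)
        classes.append(
            type("Test_" + name, (unittest.TestCase,), {"test_same": test_same}))
    return classes

def build_suite():
    """Collect the generated classes into a single suite."""
    suite = unittest.TestSuite()
    loader = unittest.TestLoader()
    for cls in make_tests():
        suite.addTests(loader.loadTestsFromTestCase(cls))
    return suite

def run_quietly():
    """Run the generated suite without printing and return the result object."""
    return unittest.TextTestRunner(stream=io.StringIO()).run(build_suite())
```

Calling run_quietly() returns a unittest result whose testsRun attribute counts one test per entry in CASES.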
import re
import glob
from unittest import TestSuite
from sys import modules
from os.path import join, abspath, dirname, basename
def normalize_html(s):
s = re.sub(r"\s+", " ", s)
s = re.sub(r"(?s)\s+<", "<", s)
s = re.sub(r"(?s)>\s+", ">", s)
s = re.sub(r"\r", "", s)
return s
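As a quick illustration of what normalize_html is for, the sketch below repeats the helper so that it runs on its own and compares two differently formatted versions of the same markup. The sample strings are made up for the example.

```python
import re

def normalize_html(s):
    # Same four substitutions as the helper above.
    s = re.sub(r"\s+", " ", s)       # collapse every whitespace run to one space
    s = re.sub(r"(?s)\s+<", "<", s)  # drop whitespace before a tag
    s = re.sub(r"(?s)>\s+", ">", s)  # drop whitespace after a tag
    s = re.sub(r"\r", "", s)         # remove any remaining carriage returns
    return s

pretty = "<p>\n  Hello   world </p>\r\n <b> x </b>\n"
compact = "<p>Hello world</p>\n<b>x</b>"

# Both inputs reduce to the same string, so a test comparing them passes.
same = normalize_html(pretty) == normalize_html(compact)
```

This is why the transform tests pass normalize_html for HTML outputs: differences in indentation and line endings between the generated and the expected file do not cause a failure.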
def build_test_suite(package_name,module_names,required=1):
"""
Utility for building a test suite from a package name
and a list of modules.
If required is false, then ImportErrors will simply result
in that module's tests not being added to the returned
suite.
"""
suite = TestSuite()
try:
for name in module_names:
the_name = package_name+'.'+name
__import__(the_name,globals(),locals())
suite.addTest(modules[the_name].test_suite())
except ImportError:
if required:
raise
return suite
PREFIX = abspath(dirname(__file__))
def input_file_path(file):
return join(PREFIX, 'input', file)
def output_file_path(file):
return join(PREFIX, 'output', file)
def matching_inputs(pattern):
return [basename(path) for path in glob.glob(join(PREFIX, "input", pattern))]
def load(dotted_name, globals=None):
""" load a python module from its name """
mod = __import__(dotted_name, globals)
components = dotted_name.split('.')
for comp in components[1:]:
mod = getattr(mod, comp)
return mod
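The getattr loop in load exists because `__import__` with a dotted name returns the top-level package rather than the innermost module. A small sketch using only the standard library (xml.dom.minidom is just a convenient nested module for the demonstration):

```python
import importlib
import xml.dom.minidom

def load(dotted_name):
    """Import `dotted_name` and return the innermost module."""
    mod = __import__(dotted_name)            # returns the top-level package
    for comp in dotted_name.split(".")[1:]:
        mod = getattr(mod, comp)             # walk down to the submodule
    return mod

top = __import__("xml.dom.minidom")          # the `xml` package
leaf = load("xml.dom.minidom")               # the `xml.dom.minidom` module
same_as_importlib = importlib.import_module("xml.dom.minidom") is leaf
```

On Python versions that ship importlib, importlib.import_module returns the innermost module directly and could replace the manual walk.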
from rigging import transformer
import os
from stat import ST_MTIME
## BIG BAD FUNCTIONAL TEST OF OOo Word Conversion
## The interfaces work, but are not quite what we need
## I might have to back fill a chain from source/dest graphing
file = "/tmp/word.doc"
class curry:
def __init__(self, func, *fixed_args):
self.func = func
self.fixed_args = fixed_args
def __call__(self, *variable_args):
return apply(self.func, self.fixed_args +
variable_args)
data = open(file, "rb").read()
data = transformer.convert("WordToHtml", data, filename="word.doc")
print data.getData()
### Register Transforms
### This is interesting because we don't expect all transforms to be
### available on all platforms. To do this we allow things to fail at
### two levels
### 1) Imports
### If the import fails the module is removed from the list and
### will not be processed/registered
### 2) Registration
### A second phase happens when the loaded module's register method
### is called; this produces an instance that will be used to
### implement the transform. If register needs to fail, for now it
### should raise an ImportError as well (dumb, I know)
from logging import DEBUG, ERROR
from Products.PortalTransforms.utils import log
from Products.PortalTransforms.libtransforms.utils import MissingBinary
modules = [
'st', # zopish
'rest', # docutils
'word_to_html', # uno, com, wvware
'safe_html', # extract <body> and remove potentially harmful tags
'html_body', # extract only the contents of the <body> tag
'html_to_text', # re based transform
'text_to_html', # wrap text in a verbatim env
'text_pre_to_html', # wrap text into a pre
'pdf_to_html', # sf.net/projects/pdftohtml
'pdf_to_text', # www.foolabs.com/xpdf
'rtf_to_html', # sf.net/projects/rtf-converter
'rtf_to_xml', # sf.net/projects/rtf2xml
'image_to_png', # transforms any image to a PNG image
'image_to_gif', # transforms any image to a GIF image
'image_to_jpeg', # transforms any image to a JPEG image
'image_to_pcx', # transforms any image to a PCX image
'image_to_ppm', # transforms any image to a PPM image
'image_to_tiff', # transforms any image to a TIFF image
'image_to_bmp', # transforms any image to a BMP image
'lynx_dump', # lynx -dump
'python', # python source files, no dependencies
'identity', # identity transform, no dependencies
]
g = globals()
transforms = []
for m in modules:
try:
ns = __import__(m, g, g, None)
transforms.append(ns.register())
except ImportError, e:
msg = "Problem importing module %s : %s" % (m, e)
log(msg, severity=ERROR)
except MissingBinary, e:
log(str(e), severity=DEBUG)
except Exception, e:
import traceback
traceback.print_exc()
log("Raised error %s for %s" % (e, m), severity=ERROR)
def initialize(engine):
for transform in transforms:
engine.registerTransform(transform)
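The two failure levels described in the comment block above (import, then registration) can be sketched in isolation as follows. This is an illustration in modern Python, not the module's actual code: load_transforms, its `importer` parameter and the local MissingBinary class are hypothetical stand-ins, and the real MissingBinary lives in Products.PortalTransforms.libtransforms.utils.

```python
import importlib
import logging

log = logging.getLogger("transforms-sketch")

class MissingBinary(Exception):
    """Hypothetical stand-in for libtransforms.utils.MissingBinary."""

def load_transforms(module_names, importer=importlib.import_module):
    """Return (loaded transforms, {module name: reason it was skipped})."""
    loaded, skipped = [], {}
    for name in module_names:
        try:
            module = importer(name)            # level 1: the import may fail
            loaded.append(module.register())   # level 2: register() may fail
        except ImportError as exc:
            skipped[name] = "import: %s" % exc
            log.error("Problem importing module %s : %s", name, exc)
        except MissingBinary as exc:
            skipped[name] = "missing binary: %s" % exc
            log.debug("%s", exc)
        except Exception as exc:               # anything else: report, move on
            skipped[name] = "error: %s" % exc
            log.error("Raised error %s for %s", exc, name)
    return loaded, skipped
```

Passing the importer as a parameter is only a convenience for trying the function out with fake modules; a failure in one module never prevents the remaining ones from being registered.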
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.utils import log
WARNING=100
class BrokenTransform:
__implements__ = itransform
__name__ = "broken transform"
inputs = ("BROKEN",)
output = "BROKEN"
def __init__(self, id, module, error):
self.id = id
self.module = module
self.error = error
def name(self):
return self.__name__
def convert(self, orig, data, **kwargs):
# do the format
msg = "Calling convert on BROKEN transform %s (%s). Error: %s" % \
(self.id, self.module, self.error)
log(msg, severity=WARNING, id='PortalTransforms')
print msg
data.setData('')
return data
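def register():
# placeholder arguments: BrokenTransform needs (id, module, error)
return BrokenTransform('broken', __name__, 'transform could not be loaded')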
from Products.PortalTransforms.interfaces import itransform
from Products.CMFDefault.utils import bodyfinder
class HTMLBody:
"""Simple transform which extracts the content of the body tag"""
__implements__ = itransform
__name__ = "html_body"
inputs = ('text/html',)
output = "text/html"
def __init__(self, name=None):
self.config_metadata = {
'inputs' : ('list', 'Inputs', 'Input(s) MIME type. Change with care.'),
}
if name:
self.__name__ = name
def name(self):
return self.__name__
def __getattr__(self, attr):
if attr == 'inputs':
return self.config['inputs']
if attr == 'output':
return self.config['output']
raise AttributeError(attr)
def convert(self, orig, data, **kwargs):
body = bodyfinder(orig)
data.setData(body)
return data
def register():
return HTMLBody()
from Products.PortalTransforms.libtransforms.retransform import retransform
class html_to_text(retransform):
inputs = ('text/html',)
output = 'text/plain'
def register():
# XXX convert entities with htmlentitydefs.name2codepoint ?
return html_to_text("html_to_text",
('<script [^>]>.*</script>(?im)', ''),
('<style [^>]>.*</style>(?im)', ''),
('<head [^>]>.*</head>(?im)', ''),
('(?im)<(h[1-6r]|address|p|ul|ol|dl|pre|div|center|blockquote|form|isindex|table)(?=\W)[^>]*>', ' '),
('<[^>]*>(?i)(?m)', ''),
)
"""
A simple identity transform
"""
__revision__ = '$Id: identity.py 4787 2005-08-19 21:43:41Z dreamcatcher $'
from Products.PortalTransforms.interfaces import itransform
class IdentityTransform:
""" Identity transform
return content unchanged.
"""
__implements__ = (itransform,)
__name__ = "rest_to_text"
def __init__(self, name=None, **kwargs):
self.config = {
'inputs' : ('text/x-rst',),
'output' : 'text/plain',
}
self.config_metadata = {
'inputs' : ('list', 'Inputs', 'Input(s) MIME type. Change with care.'),
'output' : ('string', 'Output', 'Output MIME type. Change with care.'),
}
self.config.update(kwargs)
def __getattr__(self, attr):
if attr == 'inputs':
return self.config['inputs']
if attr == 'output':
return self.config['output']
raise AttributeError(attr)
def name(self):
return self.__name__
def convert(self, data, cache, **kwargs):
cache.setData(data)
return cache
def register():
return IdentityTransform()
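The transform modules in this directory share one shape: a class with a name() method, `inputs` and `output` MIME types, a convert() method that writes into a data stream and returns it, and a module-level register() factory. A minimal sketch of that shape, runnable without Zope, might look like this. SketchDataStream and UpperCaseTransform are hypothetical; the real stream class is Products.PortalTransforms.data.datastream and has more methods than shown here.

```python
class SketchDataStream(object):
    """Hypothetical stand-in for PortalTransforms' datastream."""

    def __init__(self, name):
        self._name = name
        self._data = None

    def name(self):
        return self._name

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data


class UpperCaseTransform(object):
    """Hypothetical transform following the same shape as the classes above."""

    __name__ = "text_to_upper"
    inputs = ("text/plain",)
    output = "text/plain"

    def name(self):
        return self.__name__

    def convert(self, orig, data, **kwargs):
        # Write the converted content into the stream and hand it back.
        data.setData(orig.upper())
        return data


def register():
    return UpperCaseTransform()


transform = register()
stream = SketchDataStream(transform.name())
result = transform.convert("abc", stream)
```

The convert() call returns the same stream object it was given, which is what lets a chain pass one stream through several transforms in turn.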
from Products.PortalTransforms.libtransforms.piltransform import PILTransforms
class image_to_bmp(PILTransforms):
__name__ = "image_to_bmp"
inputs = ('image/*', )
output = 'image/x-ms-bmp'
format = 'bmp'
def register():
return image_to_bmp()
from Products.PortalTransforms.libtransforms.piltransform import PILTransforms
class image_to_gif(PILTransforms):
__name__ = "image_to_gif"
inputs = ('image/*', )
output = 'image/gif'
format = 'gif'
def register():
return image_to_gif()
from Products.PortalTransforms.interfaces import itransform
class image_to_html:
__implements__ = itransform
__name__ = "image_to_html"
inputs = ('image/*', )
output = 'text/html'
def name(self):
return self.__name__
def convert(self, data, cache, **kwargs):
imageName = kwargs.get("image")
cache.setData('<img src="%s"/>' %imageName)
return cache
from Products.PortalTransforms.libtransforms.piltransform import PILTransforms
class image_to_jpeg(PILTransforms):
__name__ = "image_to_jpeg"
inputs = ('image/*', )
output = 'image/jpeg'
format = 'jpeg'
def register():
return image_to_jpeg()
from Products.PortalTransforms.libtransforms.piltransform import PILTransforms
class image_to_pcx(PILTransforms):
__name__ = "image_to_pcx"
inputs = ('image/*', )
output = 'image/pcx'
format = 'pcx'
def register():
return image_to_pcx()
from Products.PortalTransforms.libtransforms.piltransform import PILTransforms
class image_to_png(PILTransforms):
__name__ = "image_to_png"
inputs = ('image/*', )
output = 'image/png'
format = 'png'
def register():
return image_to_png()
from Products.PortalTransforms.libtransforms.piltransform import PILTransforms
class image_to_ppm(PILTransforms):
__name__ = "image_to_ppm"
inputs = ('image/*', )
output = 'image/x-portable-pixmap'
format = 'ppm'
def register():
return image_to_ppm()
from Products.PortalTransforms.libtransforms.piltransform import PILTransforms
class image_to_tiff(PILTransforms):
__name__ = "image_to_tiff"
inputs = ('image/*', )
output = 'image/tiff'
format = 'tiff'
def register():
return image_to_tiff()
"""
Uses lynx -dump
"""
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.PortalTransforms.libtransforms.commandtransform import popentransform
import os
class lynx_dump(popentransform):
__implements__ = itransform
__name__ = "lynx_dump"
inputs = ('text/html',)
output = 'text/plain'
__version__ = '2004-07-02.1'
binaryName = "lynx"
# XXX does -stdin work on windows?
binaryArgs = "-dump -crawl -stdin"
useStdin = True
def getData(self, couterr):
lines = couterr.readlines()
return ''.join(lines[3:])
class old_lynx_dump(commandtransform):
__implements__ = itransform
__name__ = "lynx_dump"
inputs = ('text/html',)
output = 'text/plain'
binaryName = "lynx"
binaryArgs = "-dump"
def __init__(self):
commandtransform.__init__(self, binary=self.binaryName)
def convert(self, data, cache, **kwargs):
kwargs['filename'] = 'unknown.html'
tmpdir, fullname = self.initialize_tmpdir(data, **kwargs)
orig_name = os.path.splitext(os.path.basename(fullname))[0]
outname = "%s/%s.txt" % (tmpdir, orig_name)
self.invokeCommand(tmpdir, fullname, outname)
text = self.astext(outname)
self.cleanDir(tmpdir)
cache.setData(text)
return cache
def invokeCommand(self, tmpdir, inputname, outname):
os.system('cd "%s" && %s %s "%s" 1>"%s" 2>/dev/null' % \
(tmpdir, self.binary, self.binaryArgs, inputname, outname))
def astext(self, outname):
txtfile = open(outname, 'r')
txt = txtfile.read()
txtfile.close()
return txt
def register():
return lynx_dump()
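lynx_dump pipes the HTML to the binary's standard input and drops the first three lines of its output, while old_lynx_dump builds a shell command string for os.system. A hedged sketch of the stdin-based approach with the standard subprocess module is shown below. run_filter and FAKE_BINARY are hypothetical names, this is not how popentransform is implemented, and a small Python child process stands in for lynx so the example runs anywhere.

```python
import subprocess
import sys

def run_filter(argv, data, skip_lines=0):
    """Feed `data` (bytes) to the command's stdin and return its stdout as text."""
    proc = subprocess.run(argv, input=data, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, check=True)
    lines = proc.stdout.decode("utf-8").splitlines(True)
    # Mirror getData() above: optionally drop leading lines of the output.
    return "".join(lines[skip_lines:])

# Stand-in for an external binary: a tiny Python child that upper-cases stdin.
FAKE_BINARY = [sys.executable, "-c",
               "import sys; sys.stdout.write(sys.stdin.read().upper())"]

text = run_filter(FAKE_BINARY, b"a\nb\nc\nd\n", skip_lines=3)
```

Passing the command as an argument list avoids interpolating file names into a shell string, and check=True raises CalledProcessError when the binary exits with a non-zero status.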
import win32com, sys, string, win32api, traceback, re, tempfile, os
import win32com.client
# from win32com.test.util import CheckClean
import pythoncom
from win32com.client import gencache
from win32com.client import constants, Dispatch
from pywintypes import Unicode
import os.path
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.PortalTransforms.libtransforms.utils import bodyfinder, scrubHTML
class document(commandtransform):
def __init__(self, name, data):
"""Initialization: create tmp work
directory and copy the document into a file"""
commandtransform.__init__(self, name)
name = self.name()
if not name.endswith('.doc'):
name = name + ".doc"
self.tmpdir, self.fullname = self.initialize_tmpdir(data, filename=name)
def convert(self):
try:
# initialize COM for multi-threading, ignoring any errors
# when someone else has already initialized differently.
pythoncom.CoInitializeEx(pythoncom.COINIT_MULTITHREADED)
except pythoncom.com_error:
pass
word = Dispatch("Word.Application")
word.Visible = 0
word.DisplayAlerts = 0
doc = word.Documents.Open(self.fullname)
# Let's set up some html saving options for this document
doc.WebOptions.RelyOnCSS = 1
doc.WebOptions.OptimizeForBrowser = 1
doc.WebOptions.BrowserLevel = 0 # constants.wdBrowserLevelV4
doc.WebOptions.OrganizeInFolder = 0
doc.WebOptions.UseLongFileNames = 1
doc.WebOptions.RelyOnVML = 0
doc.WebOptions.AllowPNG = 1
# And then save the document into HTML
doc.SaveAs(FileName = "%s.htm" % (self.fullname),
FileFormat = 8) # constants.wdFormatHTML)
# TODO -- Extract Metadata (author, title, keywords) so we
# can populate the dublin core
# Converter will need to be extended to return a dict of
# possible MD fields
doc.Close()
# word.Quit()
def html(self):
htmlfile = open(self.fullname + '.htm', 'r')
html = htmlfile.read()
htmlfile.close()
html = scrubHTML(html)
body = bodyfinder(html)
return body
## This function has to be done. It's more difficult to delete the temp
## directory under Windows, because there is sometimes a directory in it.
## def cleanDir(self, tmpdir):
import re, os, tempfile
import uno
import unohelper
from com.sun.star.beans import PropertyValue
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.PortalTransforms.libtransforms.utils import bodyfinder, scrubHTML
class document(commandtransform):
def __init__(self, name, data):
""" Initialization: create tmp work directory and copy the
document into a file"""
commandtransform.__init__(self, name)
name = self.name()
self.tmpdir, self.fullname = self.initialize_tmpdir(data, filename=name)
self.outputfile = self.fullname + '.html'
def convert(self):
"""Convert the document"""
localContext = uno.getComponentContext()
resolver = localContext.ServiceManager.createInstanceWithContext(
'com.sun.star.bridge.UnoUrlResolver', localContext )
ctx = resolver.resolve(
'uno:socket,host=localhost,port=2002;'
'urp;StarOffice.ComponentContext')
smgr = ctx.ServiceManager
desktop = smgr.createInstanceWithContext(
'com.sun.star.frame.Desktop', ctx)
# load the document
url = unohelper.systemPathToFileUrl(self.fullname)
doc = desktop.loadComponentFromURL(url, '_blank', 0, ())
filterName = 'swriter: HTML (StarWriter)'
storeProps = (
PropertyValue('FilterName', 0, filterName, 0),
)
# pre-create an empty file for security reasons
url = unohelper.systemPathToFileUrl(self.outputfile)
doc.storeToURL(url, storeProps)
# local import: the module-level imports do not bind the name `com`
from com.sun.star.util import CloseVetoException
try:
doc.close(True)
except CloseVetoException:
pass
# magic to release some resources
ctx.ServiceManager
def html(self):
htmlfile = open(self.outputfile, 'r')
html = htmlfile.read()
htmlfile.close()
html = scrubHTML(html)
body = bodyfinder(html)
return body
import re, tempfile
import os, os.path
from Products.PortalTransforms.libtransforms.utils import bin_search, \
sansext, bodyfinder, scrubHTML
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
class document(commandtransform):
def __init__(self, name, data):
""" Initialization: create tmp work directory and copy the
document into a file"""
commandtransform.__init__(self, name, binary="wvHtml")
name = self.name()
if not name.endswith('.doc'):
name = name + ".doc"
self.tmpdir, self.fullname = self.initialize_tmpdir(data, filename=name)
def convert(self):
"Convert the document"
tmpdir = self.tmpdir
# for windows, install wvware from GnuWin32 at C:\Program Files\GnuWin32\bin
# you can use:
# wvware.exe -c ..\share\wv\wvHtml.xml --charset=utf-8 -d d:\temp d:\temp\test.doc > test.html
if os.name == 'posix':
os.system('cd "%s" && %s --charset=utf-8 "%s" "%s.html"' % (tmpdir, self.binary,
self.fullname,
self.__name__))
def html(self):
htmlfile = open("%s/%s.html" % (self.tmpdir, self.__name__), 'r')
html = htmlfile.read()
htmlfile.close()
html = scrubHTML(html)
body = bodyfinder(html)
return body
"""
Uses the http://sf.net/projects/pdftohtml bin to do its handy work
"""
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.PortalTransforms.libtransforms.commandtransform import popentransform
from Products.CMFDefault.utils import bodyfinder
import os
class popen_pdf_to_html(popentransform):
__implements__ = itransform
__version__ = '2004-07-02.01'
__name__ = "pdf_to_html"
inputs = ('application/pdf',)
output = 'text/html'
output_encoding = 'utf-8'
binaryName = "pdftohtml"
binaryArgs = "%(infile)s -noframes -stdout -enc UTF-8"
useStdin = False
def getData(self, couterr):
return bodyfinder(couterr.read())
class pdf_to_html(commandtransform):
__implements__ = itransform
__name__ = "pdf_to_html"
inputs = ('application/pdf',)
output = 'text/html'
output_encoding = 'utf-8'
binaryName = "pdftohtml"
binaryArgs = "-noframes -enc UTF-8"
def __init__(self):
commandtransform.__init__(self, binary=self.binaryName)
def convert(self, data, cache, **kwargs):
kwargs['filename'] = 'unknown.pdf'
tmpdir, fullname = self.initialize_tmpdir(data, **kwargs)
html = self.invokeCommand(tmpdir, fullname)
path, images = self.subObjects(tmpdir)
objects = {}
if images:
self.fixImages(path, images, objects)
self.cleanDir(tmpdir)
cache.setData(bodyfinder(html))
cache.setSubObjects(objects)
return cache
def invokeCommand(self, tmpdir, fullname):
if os.name=='posix':
cmd = 'cd "%s" && %s %s "%s" 2>error_log 1>/dev/null' % (
tmpdir, self.binary, self.binaryArgs, fullname)
else:
cmd = 'cd "%s" && %s %s "%s"' % (
tmpdir, self.binary, self.binaryArgs, fullname)
os.system(cmd)
try:
htmlfilename = os.path.join(tmpdir, sansext(fullname) + '.html')
htmlfile = open(htmlfilename, 'r')
html = htmlfile.read()
htmlfile.close()
except:
try:
return open("%s/error_log" % tmpdir, 'r').read()
except:
return "transform failed while running %s (maybe this pdf file doesn't support transform)" % cmd
return html
def register():
return pdf_to_html()
"""
Uses the xpdf (www.foolabs.com/xpdf)
"""
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.PortalTransforms.libtransforms.commandtransform import popentransform
import os
class pdf_to_text(popentransform):
__implements__ = itransform
__name__ = "pdf_to_text"
inputs = ('application/pdf',)
output = 'text/plain'
output_encoding = 'utf-8'
__version__ = '2004-07-02.01'
binaryName = "pdftotext"
binaryArgs = "%(infile)s -enc UTF-8 -"
useStdin = False
class old_pdf_to_text(commandtransform):
__implements__ = itransform
__name__ = "pdf_to_text"
inputs = ('application/pdf',)
output = 'text/plain'
output_encoding = 'utf-8'
binaryName = "pdftotext"
def __init__(self):
commandtransform.__init__(self, binary=self.binaryName)
def convert(self, data, cache, **kwargs):
kwargs['filename'] = 'unknown.pdf'
tmpdir, fullname = self.initialize_tmpdir(data, **kwargs)
text = self.invokeCommand(tmpdir, fullname)
path, images = self.subObjects(tmpdir)
objects = {}
if images:
self.fixImages(path, images, objects)
self.cleanDir(tmpdir)
cache.setData(text)
cache.setSubObjects(objects)
return cache
def invokeCommand(self, tmpdir, fullname):
# FIXME: windows users...
textfile = "%s/%s.txt" % (tmpdir, sansext(fullname))
cmd = 'cd "%s" && %s -enc UTF-8 "%s" "%s" 2>error_log 1>/dev/null' % (
tmpdir, self.binary, fullname, textfile)
os.system(cmd)
try:
text = open(textfile).read()
except:
try:
return open("%s/error_log" % tmpdir, 'r').read()
except:
return ''
return text
def register():
return pdf_to_text()
"""
Original code from active state recipe
'Colorize Python source using the built-in tokenizer'
----------------------------------------------------------------------------
MoinMoin - Python Source Parser
This code is part of MoinMoin (http://moin.sourceforge.net/) and converts
Python source code to HTML markup, rendering comments, keywords, operators,
numeric and string literals in different colors.
It shows how to use the built-in keyword, token and tokenize modules
to scan Python source code and re-emit it with no changes to its
original formatting (which is the hard part).
"""
__revision__ = '$Id: python.py 3661 2005-02-23 17:05:31Z tiran $'
import string
import keyword, token, tokenize
from cStringIO import StringIO
from Products.PortalTransforms.interfaces import itransform
from DocumentTemplate.DT_Util import html_quote
## Python Source Parser #####################################################
_KEYWORD = token.NT_OFFSET + 1
_TEXT = token.NT_OFFSET + 2
class Parser:
""" Send colored python source.
"""
def __init__(self, raw, tags, out):
""" Store the source text.
"""
self.raw = string.strip(string.expandtabs(raw))
self.out = out
self.tags = tags
def format(self):
""" Parse and send the colored source.
"""
# store line offsets in self.lines
self.lines = [0, 0]
pos = 0
while 1:
pos = string.find(self.raw, '\n', pos) + 1
if not pos: break
self.lines.append(pos)
self.lines.append(len(self.raw))
# parse the source and write it
self.pos = 0
text = StringIO(self.raw)
self.out.write('<pre class="python">\n')
try:
tokenize.tokenize(text.readline, self)
except tokenize.TokenError, ex:
msg = ex[0]
line = ex[1][0]
self.out.write("<h5 class='error'>ERROR: %s%s</h5>" % (
msg, self.raw[self.lines[line]:]))
self.out.write('\n</pre>\n')
def __call__(self, toktype, toktext, (srow,scol), (erow,ecol), line):
""" Token handler.
"""
#print "type", toktype, token.tok_name[toktype], "text", toktext,
#print "start", srow,scol, "end", erow,ecol, "<br>"
## calculate new positions
oldpos = self.pos
newpos = self.lines[srow] + scol
self.pos = newpos + len(toktext)
## handle newlines
if toktype in [token.NEWLINE, tokenize.NL]:
self.out.write('\n')
return
## send the original whitespace, if needed
if newpos > oldpos:
self.out.write(self.raw[oldpos:newpos])
## skip indenting tokens
if toktype in [token.INDENT, token.DEDENT]:
self.pos = newpos
return
## map token type to a group
if token.LPAR <= toktype and toktype <= token.OP:
toktype = 'OP'
elif toktype == token.NAME and keyword.iskeyword(toktext):
toktype = 'KEYWORD'
else:
toktype = tokenize.tok_name[toktype]
open_tag = self.tags.get('OPEN_'+toktype, self.tags['OPEN_TEXT'])
close_tag = self.tags.get('CLOSE_'+toktype, self.tags['CLOSE_TEXT'])
## send text
self.out.write(open_tag)
self.out.write(html_quote(toktext))
self.out.write(close_tag)
class PythonTransform:
"""Colorize Python source files
"""
__implements__ = itransform
__name__ = "python_to_html"
inputs = ("text/x-python",)
output = "text/html"
config = {
'OPEN_NUMBER': '<span style="color: #0080C0;">',
'CLOSE_NUMBER': '</span>',
'OPEN_OP': '<span style="color: #0000C0;">',
'CLOSE_OP': '</span>',
'OPEN_STRING': '<span style="color: #004080;">',
'CLOSE_STRING': '</span>',
'OPEN_COMMENT': '<span style="color: #008000;">',
'CLOSE_COMMENT': '</span>',
'OPEN_NAME': '<span style="color: #000000;">',
'CLOSE_NAME': '</span>',
'OPEN_ERRORTOKEN': '<span style="color: #FF8080;">',
'CLOSE_ERRORTOKEN': '</span>',
'OPEN_KEYWORD': '<span style="color: #C00000;">',
'CLOSE_KEYWORD': '</span>',
'OPEN_TEXT': '',
'CLOSE_TEXT': '',
}
def name(self):
return self.__name__
def convert(self, orig, data, **kwargs):
dest = StringIO()
Parser(orig, self.config, dest).format()
data.setData(dest.getvalue())
return data
def register():
return PythonTransform()
from Products.PortalTransforms.interfaces import itransform
from reStructuredText import HTML
import sys
class rest:
__implements__ = itransform
__name__ = "rest_to_html"
inputs = ("text/x-rst", "text/restructured",)
output = "text/html"
def name(self):
return self.__name__
def convert(self, orig, data, **kwargs):
# do the format
encoding = kwargs.get('encoding', 'utf-8')
input_encoding = kwargs.get('input_encoding', encoding)
output_encoding = kwargs.get('output_encoding', encoding)
language = kwargs.get('language', 'en')
warnings = kwargs.get('warnings', None)
settings = {'documentclass': '',
'traceback': 1,
}
html = HTML(orig,
input_encoding=input_encoding,
output_encoding=output_encoding,
language_code=language,
initial_header_level=2,
warnings=warnings,
settings=settings)
html = html.replace(' class="document"', '', 1)
data.setData(html)
return data
def register():
return rest()
"""
Uses the http://freshmeat.net/projects/rtfconverter/ binary to do its handiwork
"""
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.CMFDefault.utils import bodyfinder
import os
class rtf_to_html(commandtransform):
__implements__ = itransform
__name__ = "rtf_to_html"
inputs = ('application/rtf',)
output = 'text/html'
binaryName = "rtf-converter"
def __init__(self):
commandtransform.__init__(self, binary=self.binaryName)
def convert(self, data, cache, **kwargs):
kwargs['filename'] = 'unknown.rtf'
tmpdir, fullname = self.initialize_tmpdir(data, **kwargs)
html = self.invokeCommand(tmpdir, fullname)
path, images = self.subObjects(tmpdir)
objects = {}
if images:
self.fixImages(path, images, objects)
self.cleanDir(tmpdir)
cache.setData(bodyfinder(html))
cache.setSubObjects(objects)
return cache
def invokeCommand(self, tmpdir, fullname):
# FIXME: windows users...
htmlfile = "%s/%s.html" % (tmpdir, sansext(fullname))
cmd = 'cd "%s" && %s -o "%s" "%s" 2>error_log 1>/dev/null' % (
tmpdir, self.binary, htmlfile, fullname)
os.system(cmd)
try:
html = open(htmlfile).read()
except:
try:
return open("%s/error_log" % tmpdir, 'r').read()
except:
return ''
return html
def register():
return rtf_to_html()
"""
Uses the http://sf.net/projects/rtf2xml binary to do its handiwork
"""
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
import os
class rtf_to_xml(commandtransform):
__implements__ = itransform
__name__ = "rtf_to_xml"
inputs = ('application/rtf',)
output = 'text/xml'
binaryName = "rtf2xml"
def __init__(self):
commandtransform.__init__(self, binary=self.binaryName)
def convert(self, data, cache, **kwargs):
kwargs['filename'] = 'unknown.rtf'
tmpdir, fullname = self.initialize_tmpdir(data, **kwargs)
xml = self.invokeCommand(tmpdir, fullname)
path, images = self.subObjects(tmpdir)
objects = {}
if images:
self.fixImages(path, images, objects)
self.cleanDir(tmpdir)
cache.setData(xml)
cache.setSubObjects(objects)
return cache
def invokeCommand(self, tmpdir, fullname):
# FIXME: windows users...
xmlfile = "%s/%s.xml" % (tmpdir, sansext(fullname))
cmd = 'cd "%s" && %s -o "%s" "%s" 2>error_log 1>/dev/null' % (
tmpdir, self.binary, xmlfile, fullname)
os.system(cmd)
try:
xml = open(xmlfile).read()
except:
try:
return open("%s/error_log" % tmpdir, 'r').read()
except:
return ''
return xml
def register():
return rtf_to_xml()
import logging
from sgmllib import SGMLParser
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.utils import log
from Products.CMFDefault.utils import bodyfinder
from Products.CMFDefault.utils import IllegalHTML
from Products.CMFDefault.utils import SimpleHTMLParser
from Products.CMFDefault.utils import VALID_TAGS
from Products.CMFDefault.utils import NASTY_TAGS
# tag mapping: tag -> short or long tag
VALID_TAGS = VALID_TAGS.copy()
NASTY_TAGS = NASTY_TAGS.copy()
# add some tags to allowed types. This should be fixed in CMFDefault
VALID_TAGS['ins'] = 1
VALID_TAGS['del'] = 1
VALID_TAGS['q'] = 1
VALID_TAGS['map']=1
VALID_TAGS['area']=1
msg_pat = """
<div class="system-message">
<p class="system-message-title">System message: %s</p>
%s</div>
"""
class StrippingParser(SGMLParser):
"""Pass only allowed tags; raise exception for known-bad.
Copied from Products.CMFDefault.utils
Copyright (c) 2001 Zope Corporation and Contributors. All Rights Reserved.
"""
from htmlentitydefs import entitydefs # replace entitydefs from sgmllib
def __init__(self, valid, nasty, remove_javascript, raise_error):
SGMLParser.__init__( self )
self.result = []
self.valid = valid
self.nasty = nasty
self.remove_javascript = remove_javascript
self.raise_error = raise_error
self.suppress = False
def handle_data(self, data):
if self.suppress: return
if data:
self.result.append(data)
def handle_charref(self, name):
if self.suppress: return
self.result.append('&#%s;' % name)
def handle_comment(self, comment):
pass
def handle_decl(self, data):
pass
def handle_entityref(self, name):
if self.suppress: return
if self.entitydefs.has_key(name):
x = ';'
else:
# this breaks unstandard entities that end with ';'
x = ''
self.result.append('&%s%s' % (name, x))
def unknown_starttag(self, tag, attrs):
""" Delete all tags except for legal ones.
"""
if self.suppress: return
#if self.remove_javascript and tag == script and :
if self.valid.has_key(tag):
self.result.append('<' + tag)
for k, v in attrs:
if self.remove_javascript and k.strip().lower().startswith('on'):
if not self.raise_error: continue
else: raise IllegalHTML, 'Javascript event "%s" not allowed.' % k
elif self.remove_javascript and v.strip().lower().startswith('javascript:' ):
if not self.raise_error: continue
else: raise IllegalHTML, 'Javascript URI "%s" not allowed.' % v
else:
self.result.append(' %s="%s"' % (k, v))
#UNUSED endTag = '</%s>' % tag
if self.valid.get(tag):
self.result.append('>')
else:
self.result.append(' />')
elif self.nasty.has_key(tag):
self.suppress = True
if self.raise_error:
raise IllegalHTML, 'Dynamic tag "%s" not allowed.' % tag
else:
# omit tag
pass
def unknown_endtag(self, tag):
if self.nasty.has_key(tag):
self.suppress = False
if self.suppress: return
if self.valid.get(tag):
self.result.append('</%s>' % tag)
#remTag = '</%s>' % tag
def getResult(self):
return ''.join(self.result)
def scrubHTML(html, valid=VALID_TAGS, nasty=NASTY_TAGS,
remove_javascript=True, raise_error=True):
""" Strip illegal HTML tags from string text.
"""
parser = StrippingParser(valid=valid, nasty=nasty,
remove_javascript=remove_javascript,
raise_error=raise_error)
parser.feed(html)
parser.close()
return parser.getResult()
class SafeHTML:
"""Simple transform which uses CMFDefault functions to
clean potentially bad tags.
Tags must be explicitly allowed in valid_tags to pass. For other tags,
only the tags themselves are removed, not their contents. Tags that are
listed in nasty_tags and not in valid_tags are removed together with
all of their contents.
Be aware that you may need to clear the cache before an object
calls the transform again.
"""
__implements__ = itransform
__name__ = "safe_html"
inputs = ('text/html',)
output = "text/x-html-safe"
def __init__(self, name=None, **kwargs):
self.config = {
'inputs': self.inputs,
'output': self.output,
'valid_tags': VALID_TAGS,
'nasty_tags': NASTY_TAGS,
'remove_javascript': 1,
'disable_transform': 0,
}
self.config_metadata = {
'inputs' : ('list', 'Inputs', 'Input(s) MIME type. Change with care.'),
'valid_tags' : ('dict',
'valid_tags',
'List of valid html-tags, value is 1 if they ' +
'have a closing part (e.g. <p>...</p>) and 0 for empty ' +
'tags (like <br />). Be careful!',
('tag', 'value')),
'nasty_tags' : ('dict',
'nasty_tags',
'Dynamic tags that are stripped with ' +
'everything they contain (like applet, object). ' +
'They are only deleted if they are not marked as valid_tags.',
('tag', 'value')),
'remove_javascript' : ("int",
'remove_javascript',
'1 to remove javascript attributes that begin with on (e.g. onClick) ' +
'and attributes where the value starts with "javascript:" ' +
'(e.g. <a href="javascript:function()">). ' +
'This does not affect <script> tags. 0 to leave the attributes.'),
'disable_transform' : ("int",
'disable_transform',
'If 1, nothing is done.')
}
self.config.update(kwargs)
if name:
self.__name__ = name
def name(self):
return self.__name__
def __getattr__(self, attr):
if attr == 'inputs':
return self.config['inputs']
if attr == 'output':
return self.config['output']
raise AttributeError(attr)
def convert(self, orig, data, **kwargs):
# note if we need an upgrade.
if not self.config.has_key('disable_transform'):
log('PortalTransforms safe_html transform needs to be '
'updated. Please re-install the PortalTransforms product to fix.',
severity=logging.ERROR)
# if we have a config that we don't want to delete
# we need a disable option
if self.config.get('disable_transform'):
data.setData(orig)
return data
try:
safe = scrubHTML(
bodyfinder(orig),
valid=self.config.get('valid_tags', {}),
nasty=self.config.get('nasty_tags', {}),
remove_javascript=self.config.get('remove_javascript', True),
raise_error=False)
except IllegalHTML, inst:
data.setData(msg_pat % ("Error", str(inst)))
else:
data.setData(safe)
return data
def register():
return SafeHTML()
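The whitelist logic of StrippingParser can be sketched in a few lines. This is an illustration only, assuming Python 3's `html.parser.HTMLParser` as a stand-in for `sgmllib.SGMLParser` (which this commit uses); the `VALID` and `NASTY` tables, `Stripper` and `scrub` are hypothetical and much smaller than the CMFDefault originals:

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical tag tables: 1 means the tag has a closing part, 0 means empty tag.
VALID = {'p': 1, 'b': 1, 'a': 1, 'br': 0}
NASTY = {'script': 1, 'object': 1}

class Stripper(HTMLParser):
    def __init__(self):
        HTMLParser.__init__(self, convert_charrefs=True)
        self.result = []
        self.suppress = False  # True while inside a nasty tag

    def handle_starttag(self, tag, attrs):
        if self.suppress:
            return
        if tag in VALID:
            kept = []
            for k, v in attrs:
                v = v or ''
                # Drop event handlers (onclick, ...) and javascript: URIs.
                if k.lower().startswith('on'):
                    continue
                if v.strip().lower().startswith('javascript:'):
                    continue
                kept.append(' %s="%s"' % (k, escape(v, quote=True)))
            end = '>' if VALID[tag] else ' />'
            self.result.append('<%s%s%s' % (tag, ''.join(kept), end))
        elif tag in NASTY:
            # Nasty tags are dropped together with everything they contain.
            self.suppress = True
        # Any other tag is omitted, but its contents are kept.

    def handle_endtag(self, tag):
        if tag in NASTY:
            self.suppress = False
        if self.suppress:
            return
        if VALID.get(tag):
            self.result.append('</%s>' % tag)

    def handle_data(self, data):
        if not self.suppress:
            self.result.append(escape(data, quote=False))

def scrub(html_text):
    parser = Stripper()
    parser.feed(html_text)
    parser.close()
    return ''.join(parser.result)
```

Unlike scrubHTML with raise_error=True, this sketch never raises; it always behaves like the raise_error=False path used by SafeHTML.convert.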
from StructuredText.StructuredText import HTML
from Products.PortalTransforms.interfaces import itransform
DEFAULT_STX_LEVEL = 2
STX_LEVEL = DEFAULT_STX_LEVEL
class st:
__implements__ = itransform
__name__ = "st_to_html"
inputs = ("text/structured",)
output = "text/html"
def name(self):
return self.__name__
def convert(self, orig, data, level=None, **kwargs):
if level is None:
level = STX_LEVEL
data.setData(HTML(orig, level=level, header=0))
return data
def register():
return st()
from Products.PortalTransforms.interfaces import itransform
from DocumentTemplate.DT_Util import html_quote
__revision__ = '$Id: text_pre_to_html.py 3658 2005-02-23 16:29:54Z tiran $'
class TextPreToHTML:
"""simple transform which wraps raw text into a <pre> tag"""
__implements__ = itransform
__name__ = "text-pre_to_html"
inputs = ('text/plain-pre',)
output = "text/html"
def __init__(self, name=None):
self.config_metadata = {
'inputs' : ('list', 'Inputs', 'Input(s) MIME type. Change with care.'),
}
if name:
self.__name__ = name
def name(self):
return self.__name__
def __getattr__(self, attr):
if attr == 'inputs':
return self.config['inputs']
if attr == 'output':
return self.config['output']
raise AttributeError(attr)
def convert(self, orig, data, **kwargs):
data.setData('<pre class="data">%s</pre>' % html_quote(orig))
return data
def register():
return TextPreToHTML()
from Products.PortalTransforms.interfaces import itransform
from DocumentTemplate.DT_Util import html_quote
__revision__ = '$Id: text_to_html.py 4787 2005-08-19 21:43:41Z dreamcatcher $'
class TextToHTML:
"""simple transform which wraps raw text in a <p> tag and turns line breaks into <br /> tags"""
__implements__ = itransform
__name__ = "text_to_html"
output = "text/html"
def __init__(self, name=None, inputs=('text/plain',)):
self.config = { 'inputs' : inputs, }
self.config_metadata = {
'inputs' : ('list', 'Inputs', 'Input(s) MIME type. Change with care.'),
}
if name:
self.__name__ = name
def name(self):
return self.__name__
def __getattr__(self, attr):
if attr == 'inputs':
return self.config['inputs']
if attr == 'output':
return self.config['output']
raise AttributeError(attr)
def convert(self, orig, data, **kwargs):
# Replaces all line breaks with a br tag, and wraps it in a p tag.
data.setData('<p>%s</p>' % html_quote(orig.strip()).replace('\n', '<br />'))
return data
def register():
return TextToHTML()
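The conversion performed by TextToHTML.convert is small enough to show on its own. A minimal sketch, assuming Python 3's `html.escape` in place of Zope's `html_quote` (the two may differ in which quote characters they escape); `text_to_html` is a hypothetical helper name:

```python
from html import escape

def text_to_html(text):
    """Escape raw text, turn newlines into <br />, and wrap it in <p>."""
    # Strip surrounding whitespace first, as the transform above does.
    quoted = escape(text.strip(), quote=True)
    return '<p>%s</p>' % quoted.replace('\n', '<br />')
```

Escaping happens before the `<br />` substitution, so the inserted tags are not themselves escaped.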
UNO_TYPES=file:///usr/lib/openoffice/program/applicat.rdb
UNO_SERVICES=file:///usr/lib/openoffice/program/applicat.rdb
import os, os.path, sys
def package_home(globals_dict):
__name__=globals_dict['__name__']
m=sys.modules[__name__]
if hasattr(m,'__path__'):
r=m.__path__[0]
elif "." in __name__:
r=sys.modules[__name__[:__name__.rfind('.')]].__path__[0]
else:
r=__name__
return os.path.abspath(os.path.join(os.getcwd(), r))
setup_ini = 'file://%s/uno.ini' % package_home(globals())
import PyUNO
class uno:
def __init__ ( self, connection='socket,host=localhost,port=2002;urp', setup=setup_ini ):
""" do the bootstrap
connection can be one or more of the following:
socket,
host = localhost | <hostname> | <ip-addr>,
port = <port>,
service = soffice,
user = <username>,
password = <password>
;urp
"""
self.XComponentContext = PyUNO.bootstrap ( setup )
self.XUnoUrlResolver, o = \
self.XComponentContext.ServiceManager.createInstanceWithContext ( 'com.sun.star.bridge.UnoUrlResolver', self.XComponentContext )
self.XNamingService, o = self.XUnoUrlResolver.resolve ( 'uno:%s;StarOffice.NamingService' % connection )
self.XMultiServiceFactory, o = self.XNamingService.getRegisteredObject ('StarOffice.ServiceManager')
self.XComponentLoader, o = \
self.XMultiServiceFactory.createInstance ( 'com.sun.star.frame.Desktop' )
def new ( self, what, where='_blank', no=0, propertyValues=() ):
return self.XComponentLoader.loadComponentFromURL (
what, where, no, propertyValues )
def newIdlStruct ( self, type ):
return PyUNO.createIdlStruct ( self.XMultiServiceFactory, type )
def newCalc (self):
return self.new ('private:factory/scalc')
def newWriter (self):
return self.new ('private:factory/swriter')
def newImpress (self):
return self.new ('private:factory/simpress')
def newDraw (self):
return self.new ('private:factory/sdraw')
def newPropertyValue (self, propertyValue={} ):
property = self.newIdlStruct ( 'com.sun.star.beans.PropertyValue' )
if propertyValue.has_key('Name'):
property.Name = propertyValue['Name']
if propertyValue.has_key('Value'):
property.Value = propertyValue['Value']
if propertyValue.has_key('State'):
property.State = propertyValue['State']
if propertyValue.has_key('Handle'):
property.Handle = propertyValue['Handle']
return property
def newPropertyValues ( self, propertyValues=() ):
# build a tuple of PropertyValue structs without consuming the caller's sequence
return tuple ( [ self.newPropertyValue ( p ) for p in propertyValues ] )
def newBoolean ( self, bool=0 ):
if bool:
return PyUNO.true()
else:
return PyUNO.false()
from Products.PortalTransforms.interfaces import itransform
EXTRACT_BODY = 1
EXTRACT_STYLE = 0
FIX_IMAGES = 1
IMAGE_PREFIX = "img_"
# disable office_uno because it doesn't support multithread yet
ENABLE_UNO = False
import os
if os.name == 'posix':
try:
if ENABLE_UNO:
from office_uno import document
else:
raise ImportError('office_uno is disabled')
except:
from office_wvware import document
else:
try:
if ENABLE_UNO:
from office_uno import document
else:
raise ImportError('office_uno is disabled')
except:
from office_com import document
import os.path
class word_to_html:
__implements__ = itransform
__name__ = "word_to_html"
inputs = ('application/msword',)
output = 'text/html'
output_encoding = 'utf-8'
tranform_engine = document.__module__
def name(self):
return self.__name__
def convert(self, data, cache, **kwargs):
orig_file = 'unknown.doc'
doc = document(orig_file, data)
doc.convert()
html = doc.html()
path, images = doc.subObjects(doc.tmpdir)
objects = {}
if images:
doc.fixImages(path, images, objects)
doc.cleanDir(doc.tmpdir)
cache.setData(html)
cache.setSubObjects(objects)
return cache
def register():
return word_to_html()
"""try to build some useful transformations with the command and xml
transforms and the available binaries
"""
from Products.PortalTransforms.libtransforms.utils import bin_search, MissingBinary
COMMAND_CONFIGS = (
('lynx_dump', '.html',
{'binary_path' : 'lynx',
'command_line' : '-dump %s',
'inputs' : ('text/html',),
'output' : 'text/plain',
}),
('tidy_html', '.html',
{'binary_path' : 'tidy',
'command_line' : '%s',
'inputs' : ('text/html',),
'output' : 'text/html',
}),
('rtf_to_html', None,
{'binary_path' : 'unrtf',
'command_line' : '%s',
'inputs' : ('application/rtf',),
'output' : 'text/html',
}),
('ppt_to_html', None,
{'binary_path' : 'ppthtml',
'command_line' : '%s',
'inputs' : ('application/vnd.ms-powerpoint',),
'output' : 'text/html',
}),
('excel_to_html', None,
{'binary_path' : 'xlhtml',
'command_line' : '-nh -a %s',
'inputs' : ('application/vnd.ms-excel',),
'output' : 'text/html',
}),
('ps_to_text', None,
{'binary_path' : 'ps2ascii',
'command_line' : '%s',
'inputs' : ('application/postscript',),
'output' : 'text/plain',
}),
)
TRANSFORMS = {}
from command import ExternalCommandTransform
for tr_name, extension, config in COMMAND_CONFIGS:
try:
bin = bin_search(config['binary_path'])
except MissingBinary:
print 'no such binary', config['binary_path']
else:
tr = ExternalCommandTransform(tr_name, extension)
tr.__name__ = tr_name
# copy the static config first, then record the resolved binary path
tr.config = config.copy()
tr.config['binary_path'] = bin
TRANSFORMS[tr_name] = tr
XMLPROCS_CONF = {
'xsltproc' : '--catalogs --xinclude -o %(output)s %(transform)s %(input)s',
'4xslt' : ' -o %(output)s %(input)s %(transform)s'
}
bin = None
for proc in XMLPROCS_CONF.keys():
try:
bin = bin_search(proc)
break
except MissingBinary:
print 'no such binary', proc
if bin is not None:
print 'Using %s as xslt processor' % bin
from xml import XsltTransform
for output in ('html', 'plain'):
name = "xml_to_" + output
command_line = XMLPROCS_CONF[proc]
tr = XsltTransform(name=name, inputs=('text/xml',), output='text/'+output,
binary_path=bin, command_line=command_line)
TRANSFORMS[name] = tr
def initialize(engine):
for transform in TRANSFORMS.values():
engine.registerTransform(transform)
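The module above only registers transforms whose binary can be found on the server. A minimal sketch of that filtering step, assuming Python 3's `shutil.which` as a stand-in for `bin_search`; `CONFIGS` and `available_transforms` are hypothetical names, and the lookup function is injectable so the behaviour can be checked without the binaries installed:

```python
import shutil

# Hypothetical two-entry excerpt of COMMAND_CONFIGS.
CONFIGS = (
    ('lynx_dump', {'binary_path': 'lynx', 'command_line': '-dump %s'}),
    ('tidy_html', {'binary_path': 'tidy', 'command_line': '%s'}),
)

def available_transforms(configs, which=shutil.which):
    """Return {name: config} for every entry whose binary can be located."""
    found = {}
    for name, config in configs:
        path = which(config['binary_path'])
        if path is None:
            continue  # binary missing: skip this transform
        entry = dict(config)          # copy, so the shared table stays untouched
        entry['binary_path'] = path   # keep the resolved full path
        found[name] = entry
    return found
```

Copying each config before storing the resolved path keeps the module-level table reusable if the lookup is repeated.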
"""
A custom transform using external command
"""
__revision__ = '$Id: command.py 4439 2005-06-15 16:32:36Z panjunyong $'
import os.path
from os import popen3
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.PortalTransforms.utils import log
class ExternalCommandTransform(commandtransform):
""" Custom external command
Transform content by launching an external command.
The command should take the content in an input file (designated by '%s' in
the command line parameters) and return output on stdout.
Input and output MIME types must be set correctly!
"""
__implements__ = (itransform,)
__name__ = "command_transform"
def __init__(self, name=None, input_extension=None, **kwargs):
self.config = {
'binary_path' : '',
'command_line' : '',
'inputs' : ('text/plain',),
'output' : 'text/plain',
}
self.config_metadata = {
'binary_path' : ('string', 'Binary path',
'Path of the executable on the server.'),
'command_line' : ('string', 'Command line',
'''Additional command line options.
There should be at least the input file (designated by "%s").
The transformation\'s result must be printed on stdout.
'''),
'inputs' : ('list', 'Inputs', 'Input(s) MIME type. Change with care.'),
'output' : ('string', 'Output', 'Output MIME type. Change with care.'),
}
self.config.update(kwargs)
commandtransform.__init__(self, name=name, binary=self.config['binary_path'], **kwargs)
# use the full binary path
self.config.update({'binary_path':self.binary})
self.input_extension = input_extension
def __getattr__(self, attr):
if attr == 'inputs':
return self.config['inputs']
if attr == 'output':
return self.config['output']
raise AttributeError(attr)
def convert(self, data, cache, **kwargs):
filename = kwargs.get('filename') or 'unknown'
if self.input_extension is not None:
kwargs['filename'] = 'unknown' + self.input_extension
else:
kwargs['filename'] = 'unknown' + os.path.splitext(filename)[-1]
tmpdir, fullname = self.initialize_tmpdir(data, **kwargs)
data = self.invokeCommand(fullname)
cache.setData(data)
path, images = self.subObjects(tmpdir)
objects = {}
if images:
self.fixImages(path, images, objects)
cache.setSubObjects(objects)
self.cleanDir(tmpdir)
return cache
def invokeCommand(self, input_name):
command = '%(binary_path)s %(command_line)s' % self.config
input, output, error = popen3(command % input_name)
input.close()
# read stderr first, otherwise we may hang on stdout;
# this still hangs on Windows, so it is commented out :-(
# error_data = error.read()
error_data = 'error while running "%s"' % (command % input_name)
error.close()
data = output.read()
output.close()
if error_data and not data:
data = error_data
else:
log('Error while running "%s":\n %s' % (command % input_name,
error_data))
return data
def register():
return ExternalCommandTransform()
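The comments in invokeCommand point at a real hazard: reading one pipe while the child blocks on the other can hang. A minimal sketch of one way around it, assuming Python 3's `subprocess.run` instead of the `popen3` call used in this commit; `run_command` is a hypothetical helper, not part of PortalTransforms:

```python
import subprocess
import sys

def run_command(argv):
    """Run argv and return its stdout, or its stderr when stdout is empty."""
    # subprocess.run drains stdout and stderr together, so a full
    # stderr buffer cannot block the child while we wait on stdout.
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE, universal_newlines=True)
    if proc.stdout:
        return proc.stdout
    return proc.stderr
```

For example, `run_command([sys.executable, '-c', 'print("converted")'])` uses the current interpreter as a stand-in for an external converter. Passing the command as a list also avoids shell quoting of the input file name.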
"""
A custom transform using external command
"""
__revision__ = '$Id: xml.py 4787 2005-08-19 21:43:41Z dreamcatcher $'
from os.path import join, dirname, exists
import re
from os import popen3, popen4, system
from cStringIO import StringIO
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext
from Products.PortalTransforms.libtransforms.commandtransform import commandtransform
from Products.PortalTransforms.utils import log
class XsltTransform(commandtransform):
""" Custom external command
transform xml content by launching an external XSLT processor
Input and output mime types must be set correctly !
You can associate different document type to different transformations.
"""
__implements__ = (itransform,)
__name__ = "xml_to_html"
def __init__(self, name=None, **kwargs):
self.config = {
# sample configuration
'binary_path' : bin_search('xsltproc'),
'command_line' : '%(transform)s %(input)s',
'inputs' : ('text/xml',),
'output' : 'text/html',
'output_encoding' : 'UTF-8',
'dtds' : {
'-//OASIS//DTD DocBook V4.1//EN' : '/usr/share/sgml/docbook/xsl-stylesheets-1.29/html/docbook.xsl'
},
'default_transform': ''
}
self.config_metadata = {
'binary_path' : ('string', 'Binary path',
'Path of the executable on the server.'),
'command_line' : ('string', 'Command line',
'''Additional command line option.
There should be at least the input file (designated by "%(input)s") and the xsl
file (designated by "%(transform)s"). The transformation\'s result must be printed on stdout.
'''),
'inputs' : ('list', 'Inputs', 'Input(s) MIME type. Change with care.'),
'output' : ('string', 'Output', 'Output MIME type. Change with care.'),
'output_encoding': ('string', 'Output encoding', 'Output encoding.'),
'dtds' : ('dict', 'DTDs',
'Association of public ids or dtds to XSL transformations.',
('Public id', 'XSLT path')),
'default_transform' : ('string', 'Default xslt',
'Default xslt, used when no specific transformation is found.'),
}
self.config.update(kwargs)
if name:
self.__name__ = name
def __getattr__(self, attr):
if attr == 'inputs':
return self.config['inputs']
if attr == 'output':
return self.config['output']
if attr == 'output_encoding':
return self.config['output_encoding']
raise AttributeError(attr)
def convert(self, data, cache, **kwargs):
base_name = sansext(kwargs.get("filename") or 'unknown.xml')
dtds = self.config['dtds']
tmpdir, fullname = self.initialize_tmpdir(data, filename=base_name)
try:
try:
doctype = get_doctype(data)
except DTException:
try:
doctype = get_dtd(data)
except DTException:
log('Unable to get doctype nor dtd in %s' % data)
doctype = None
if doctype and dtds.has_key(doctype):
data = self.invokeCommand(fullname, dtds[doctype])
elif self.config['default_transform']:
data = self.invokeCommand(fullname, self.config['default_transform'])
cache.setData(data)
path, images = self.subObjects(tmpdir)
objects = {}
if images:
self.fixImages(path, images, objects)
cache.setSubObjects(objects)
return cache
finally:
self.cleanDir(tmpdir)
def invokeCommand(self, input_name, xsl):
dest_dir = dirname(input_name)
output_file = join(dirname(input_name), 'tr_output')
command = '%(binary_path)s %(command_line)s' % self.config
data = {'input': input_name, 'output': output_file, 'transform': xsl}
system(command % data)
if exists(output_file):
data = open(output_file).read()
else:
data = 'An error occurred during the transform. See the error log.'
return data
def register():
return XsltTransform()
DT_RGX = re.compile('<!DOCTYPE \w* PUBLIC \"([^"]*)\" \"([^"]*)\"')
DT_RGX2 = re.compile('<!DOCTYPE \w* SYSTEM \"([^"]*)\"')
class DTException(Exception): pass
def get_doctype(data):
""" return the public id for the doctype given some raw xml data
"""
if not hasattr(data, 'readlines'):
data = StringIO(data)
for line in data.readlines():
line = line.strip()
if not line:
continue
if line.startswith('<?xml') or line.startswith('<!-- '):
continue
m = DT_RGX.match(line)
if m is not None:
return m.group(1)
else:
raise DTException('Unable to match doctype in "%s"' % line)
def get_dtd(data):
""" return the system id (the DTD location) for the doctype given some raw xml data
"""
if not hasattr(data, 'readlines'):
data = StringIO(data)
for line in data.readlines():
line = line.strip()
if not line:
continue
if line.startswith('<?xml') or line.startswith('<!-- '):
continue
m = DT_RGX.match(line)
if m is not None:
return m.group(2)
m = DT_RGX2.match(line)
if m is not None:
return m.group(1)
else:
raise DTException('Unable to match doctype in "%s"' % line)
if __name__ == '__main__':
print get_doctype('''<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE article PUBLIC "-//LOGILAB/DTD DocBook V4.1.2-Based Extension V0.1//EN" "dcbk-logilab.dtd" []>
<book id="devtools_user_manual" lang="fr">
''')
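The two regular expressions above distinguish PUBLIC doctypes (public id plus system id) from SYSTEM doctypes (system id only). A minimal sketch that combines both lookups, assuming Python 3; `doctype_ids` is a hypothetical helper that returns `None` values instead of raising `DTException`:

```python
import re

# Same patterns as DT_RGX / DT_RGX2 above, written as raw strings.
PUBLIC_RGX = re.compile(r'<!DOCTYPE \w* PUBLIC "([^"]*)" "([^"]*)"')
SYSTEM_RGX = re.compile(r'<!DOCTYPE \w* SYSTEM "([^"]*)"')

def doctype_ids(xml_text):
    """Return (public_id, system_id) from the first significant line."""
    for line in xml_text.splitlines():
        line = line.strip()
        # Skip blank lines, the XML declaration and comments.
        if not line or line.startswith('<?xml') or line.startswith('<!--'):
            continue
        m = PUBLIC_RGX.match(line)
        if m:
            return m.group(1), m.group(2)
        m = SYSTEM_RGX.match(line)
        if m:
            return None, m.group(1)
        break  # first significant line is not a DOCTYPE
    return None, None
```

As in the original functions, only the first non-blank, non-declaration line is inspected, so a DOCTYPE split over several lines is not recognised.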
"""some common utilities
"""
import logging
from time import time
from types import UnicodeType, StringType
STRING_TYPES = (UnicodeType, StringType)
class TransformException(Exception):
pass
FB_REGISTRY = None
# logging function
logger = logging.getLogger('PortalTransforms')
def log(message, severity=logging.INFO):
logger.log(severity, message)
# directory where template for the ZMI are located
import os.path
_www = os.path.join(os.path.dirname(__file__), 'www')
skins_dir = os.path.join(os.path.dirname(__file__), 'skins')
from zExceptions import BadRequest
# directory where template for the ZMI are located
import os.path
_www = os.path.join(os.path.dirname(__file__), 'www')
skins_dir = None
1.4.0-final
\ No newline at end of file
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<form method="POST" action="manage_addTransform"
tal:attributes="action string:${here/absolute_url}/manage_addTransform;">
<div class="form-title">
Add a new transform
</div>
<div tal:define="status python:request.get('portal_status', '')"
tal:condition="status"
class="error"
tal:content="status"
/>
<table width="50%">
<tr>
<td> ID</td>
<td>
<input name="id" tal:attributes="value python:request.get('id', '');"/>
</td>
</tr><tr>
<td>
Module
</td>
<td>
<input name="module" tal:attributes="value python:request.get('module', '');"/>
</td>
</tr>
</table>
<input type="submit"/>
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<form method="POST" action="manage_addTransformsChain"
tal:attributes="action string:${here/absolute_url}/manage_addTransformsChain;">
<div class="form-title">
Add a new transforms chain
</div>
<div tal:define="status python:request.get('portal_status', '')"
tal:condition="status"
class="error"
tal:content="status"
/>
<table width="50%">
<tr>
<td> ID</td>
<td>
<input name="id" tal:attributes="value python:request.get('id', '');"/>
</td>
</tr><tr>
<td>
Description
</td>
<td>
<textarea name="description" tal:content="request/description | nothing"/>
</td>
</tr>
</table>
<input type="submit"/>
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<form method="POST"
tal:attributes="action string:${here/absolute_url}/set_parameters"
tal:define="params here/get_parameters">
<div class="form-title">
Configure transform
</div>
<p>Transform inputs: <b tal:content="python:', '.join(here.inputs)"/></p>
<p>Transform output: <b tal:content="here/output"/></p>
<p tal:condition="here/get_documentation" tal:content="here/get_documentation"/>
<div tal:define="status python:request.get('portal_status', '')"
tal:condition="status"
tal:content="status"
class="error" />
<tal:block tal:condition="params">
<table width="80%">
<tr tal:repeat="param params">
<tal:block tal:define="meta python:here.get_parameter_infos(param);
type python:meta[0];
widget string:here/tr_widgets/macros/${type}_widget;
label python:meta[1] or param;">
<td tal:content="label">Parameter's label</td>
<td>
<metal:block metal:use-macro="python:path(widget)" />
</td>
<td tal:content="python:meta[2]">
field description
</td>
</tal:block>
</tr>
</table>
<input type="submit"/>
</tal:block>
<p tal:condition="not:params">
This transform has no configurable parameters.
</p>
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<div class="form-title">
Transformation policy
</div>
<div>
This page lets you configure which transforms are applied
for a given output MIME type.
</div>
<hr/>
<form method="POST" action="manage_addPolicy">
<table>
<tr>
<th align="left">output type</th>
<th align="left">use transforms</th>
</tr>
<tr>
<td>
<select name="output_mimetype"
tal:define="mimetypes here/mimetypes_registry/list_mimetypes;
dummy mimetypes/sort">
<option tal:repeat="mimetype mimetypes"
tal:attributes="value mimetype;"
tal:content="mimetype"/>
</select>
</td>
<td>
<select name="required_transforms:list" multiple="multiple">
<option tal:repeat="id here/objectIds"
tal:attributes="value id;"
tal:content="id"/>
</select>
</td>
</tr>
</table>
<input type="submit" value="add"/>
</form>
<hr/>
<form method="POST" action="manage_delPolicies"
tal:define="policies here/listPolicies" tal:condition="policies">
<table>
<tr>
<th colspan="2" align="left">output type</th>
<th align="left">use transforms</th>
</tr>
<tr tal:repeat="policy policies">
<td>
<input type="checkbox" name="outputs:list"
tal:attributes="value python:policy[0]"/>
</td>
<td tal:content="python:policy[0]">
</td>
<td tal:content="python:', '.join(policy[1])">
</td>
</tr>
</table>
<input type="submit" value="delete selected"/>
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<div class="form-title">
Chain <span tal:replace="here/title_or_id"/>.
</div>
<div align="right">
<form method="POST" action="manage_addObject">
<select name="id">
<option tal:repeat="id here/listAddableObjectIds"
tal:attributes="value id;"
tal:content="id"/>
</select>
<input type="submit" value="Add"/>
</form>
</div>
<p>Transform inputs: <b tal:content="python:', '.join(here.inputs)"/></p>
<p>Transform output: <b tal:content="here/output"/></p>
<div tal:content="here/description"/>
<div tal:define="status request/portal_status | nothing"
tal:condition="status" class="error"
tal:content="status" />
<form method="POST" action="manage_delObjects"
tal:define="transforms here/objectValues" tal:condition="transforms">
<table width="60%" >
<tr>
<th colspan="2">transform</th>
<th>input</th>
<th>output</th>
<th colspan="2">&nbsp;</th>
</tr>
<tr tal:repeat="tr transforms">
<td>
<input type="checkbox" name="ids:list"
tal:attributes="value tr/getId"/>
</td>
<td tal:content="tr/title_or_id">
</td>
<td tal:content="python:tr.inputs[0]">
</td>
<td tal:content="tr/output">
</td>
<td>
<a tal:attributes="href string:${here/absolute_url}/move_object_down?id=${tr/getId}"
tal:condition="not:repeat/tr/end">
<img tal:attributes="src string:${here/portal_url}/down.png"/>
</a>
</td>
<td>
<a tal:attributes="href string:${here/absolute_url}/move_object_up?id=${tr/getId}"
tal:condition="not:repeat/tr/start">
<img tal:attributes="src string:${here/portal_url}/up.png"/>
</a>
</td>
</tr>
</table>
<input type="submit" value="delete selected"/>
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<p>
The following transformations have been reloaded:
</p>
<table border="1" tal:define="reloaded here/reloadTransforms">
<tr><th>name</th><th>module</th></tr>
<tr tal:repeat="tuple reloaded">
<td tal:content="python:tuple[0]"/>
<td tal:content="python:tuple[1]"/>
</tr>
</table>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<p tal:define="dummy here/reload">Transformation reloaded <span
tal:condition="exists:here/module" tal:omit-tag="">(module <b tal:content="here/module"/>)</span>
</p>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<h1 tal:replace="structure here/manage_page_header|nothing">Header</h1>
<h2 tal:define="manage_tabs_message options/manage_tabs_message | nothing"
tal:replace="structure here/manage_tabs">Tabs</h2>
<form method="POST" action="manage_setCacheValidityTime"
tal:attributes="action
string:${here/absolute_url}/manage_setCacheValidityTime;">
<div class="form-title">
Manage transformation caches
</div>
<div tal:define="status python:request.get('portal_status', '')"
class="error"
tal:condition="status"
tal:content="status" />
<table width="50%">
<tr>
<td>Lifetime of objects in cache (in seconds). 0 means infinity.</td>
<td>
<input name="seconds" tal:attributes="value request/seconds | here/max_sec_in_cache | string:0;"/>
</td>
</tr>
</table>
<input type="submit"/>
</form>
<tal:footer tal:replace="structure here/manage_page_footer|nothing">footer</tal:footer>
<div metal:define-macro="int_widget">
<input tal:attributes="name param;
value python:here.get_parameter_value(param);"/>
</div>
<div metal:define-macro="string_widget">
<input size="80"
tal:attributes="name param;
value python:here.get_parameter_value(param);"/>
</div>
<div metal:define-macro="list_widget">
<textarea tal:attributes="name string:${param}:lines;"
tal:content="python: '\n'.join(here.get_parameter_value(param))"/>
</div>
<div metal:define-macro="dict_widget">
<table tal:define="titles python:meta[3];">
<tr>
<th tal:content="python:titles[0]"/>
<th tal:content="python:titles[1]"/>
</tr><tr tal:define="values python:here.get_parameter_value(param).items();
dummy python:values.sort()"
tal:repeat="key_val values">
<td><input type='text' size="30"
tal:attributes="name python:param + '_key';
value python:key_val[0]"/></td>
<td><input type='text' size="50"
tal:attributes="name python:param + '_value';
value python:key_val[1]"/></td>
</tr><tr>
<td><input type='text' size="30"
tal:attributes="name python:param + '_key'"/></td>
<td><input type='text' size="50"
tal:attributes="name python:param + '_value'"/></td>
</tr><tr>
<td><input type='text' size="30"
tal:attributes="name python:param + '_key'"/></td>
<td><input type='text' size="50"
tal:attributes="name python:param + '_value'"/></td>
</tr>
</table>
</div>