Microsoft Word formatting

Forum for users and developers of Bullhorn's API service.




Post by scdk

We are pulling the description field from Bullhorn, but it contains a lot of bad HTML formatting because the job descriptions were pasted from Word. Does anyone have a regex they could share to remove this, or know of a function that can help?

There are also other issues. For example, bullets are coming through as:


<span style="font-family: Symbol; mso-fareast-font-family: Symbol; mso-bidi-font-family: Symbol;"><span style="mso-list: Ignore;">&Acirc;&middot;<span style='font: 7pt/normal "Times New Roman"; font-size-adjust: none; font-stretch: normal;'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span></span></span>
which shows on the website as · and looks awful.
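
A regex tends to struggle with nested spans like the ones in that snippet, so one option is a small parser-based cleaner. The following is only a minimal sketch in Python using the standard library, not a Bullhorn function: `WordHtmlCleaner` and `clean_word_html` are hypothetical names, and it assumes the description arrives as an HTML fragment string. It unwraps every span, drops inline `style`/`class`/`lang` attributes, Word comments and namespaced tags such as `o:p`, and replaces each `mso-list: Ignore` marker with a plain bullet.

```python
import html
from html.parser import HTMLParser

# Attributes that Word tends to add and that we assume are safe to discard.
DROPPED_ATTRS = {"style", "class", "lang"}


class WordHtmlCleaner(HTMLParser):
    """Hypothetical helper: strips Word-specific markup from an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        # One entry per open <span>; True while inside a Word list marker.
        self.span_stack = []

    def _ignoring(self):
        return bool(self.span_stack) and self.span_stack[-1]

    def _skip_tag(self, tag):
        # Namespaced tags such as <o:p> are Word leftovers.
        return ":" in tag or self._ignoring()

    def _attr_text(self, attrs):
        kept = [(k, v) for k, v in attrs if k not in DROPPED_ATTRS]
        return "".join(
            ' {}="{}"'.format(k, html.escape(v or "", quote=True)) for k, v in kept
        )

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            style = (dict(attrs).get("style") or "").lower().replace(" ", "")
            is_marker = "mso-list:ignore" in style
            if is_marker and not self._ignoring():
                # Replace the whole marker (glyph plus padding) with one bullet.
                self.out.append("\u2022 ")
            self.span_stack.append(is_marker or self._ignoring())
        elif not self._skip_tag(tag):
            self.out.append("<{}{}>".format(tag, self._attr_text(attrs)))

    def handle_startendtag(self, tag, attrs):
        if tag != "span" and not self._skip_tag(tag):
            self.out.append("<{}{} />".format(tag, self._attr_text(attrs)))

    def handle_endtag(self, tag):
        if tag == "span":
            if self.span_stack:
                self.span_stack.pop()
        elif not self._skip_tag(tag):
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        if not self._ignoring():
            # Character references were decoded by the parser; re-escape text.
            self.out.append(html.escape(data, quote=False))


def clean_word_html(fragment):
    """Return the fragment with spans unwrapped and Word styling removed."""
    parser = WordHtmlCleaner()
    parser.feed(fragment)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


sample = '<p><span style="mso-list: Ignore;">&Acirc;&middot;</span>First item</p>'
print(clean_word_html(sample))  # prints <p>• First item</p>
```

Emitting a literal bullet is a design shortcut for the sketch; rebuilding proper list markup from Word's paragraphs would need more logic than shown here.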

It all looks OK within Bullhorn but is nasty when pulled through. The charset on the page is <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />.
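
For what it's worth, the `&Acirc;&middot;` pair in the snippet is the pattern a UTF-8 middle dot produces when its two bytes are read as Latin-1 or Windows-1252 somewhere before the page is rendered. Whether that is what happens to this data is an assumption. The Python sketch below only demonstrates the effect and one possible repair; `repair_mojibake` is a hypothetical helper name.

```python
import html

# The UTF-8 encoding of a middle dot is two bytes: 0xC2 0xB7.
middle_dot = "\u00b7"
misread = middle_dot.encode("utf-8").decode("latin-1")

# Decoding the entities from the snippet gives the same two characters.
assert html.unescape("&Acirc;&middot;") == misread


def repair_mojibake(text):
    """Undo a UTF-8-read-as-Windows-1252 mix-up; return text unchanged if it does not apply."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Either the text has characters outside cp1252 or it was not
        # double-decoded in the first place, so leave it alone.
        return text


print(repair_mojibake(html.unescape("&Acirc;&middot;")))  # prints ·
```

Because the helper returns its input on any encoding error, text that was never mis-decoded (including a correct middle dot) passes through untouched.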

Thanks for any help